Date:	Tue, 13 May 2008 20:40:57 +0200
From:	Jens Axboe <jens.axboe@...cle.com>
To:	Matthew <jackdachef@...il.com>
Cc:	Kasper Sandberg <lkml@...anurb.dk>,
	Daniel J Blueman <daniel.blueman@...il.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: performance "regression" in cfq compared to anticipatory, deadline and noop

On Tue, May 13 2008, Jens Axboe wrote:
> On Tue, May 13 2008, Matthew wrote:
> > On Tue, May 13, 2008 at 3:05 PM, Jens Axboe <jens.axboe@...cle.com> wrote:
> > >
> > > On Tue, May 13 2008, Matthew wrote:
> > >  > On Tue, May 13, 2008 at 2:20 PM, Jens Axboe <jens.axboe@...cle.com> wrote:
> > >  > >
> > >  > > On Sun, May 11 2008, Kasper Sandberg wrote:
> > >  > >  > On Sun, 2008-05-11 at 14:14 +0100, Daniel J Blueman wrote:
> > >  > >  > > I've been experiencing this for a while also; an almost 50% regression
> > >  > >  > > is seen for single-process reads (ie sync) if slice_idle is 1ms or
> > >  > >  > > more (eg default of 8) [1], which seems phenomenal.
> > >  > >  > >
> > >  > >  > > Jens, is this the expected price to pay for optimal busy-spindle
> > >  > >  > > scheduling, a design issue, bug or am I missing something totally?
> > >  > >  > >
> > >  > >  > > Thanks,
> > >  > >  > >   Daniel
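[Aside: slice_idle is a runtime CFQ tunable, so reproducing Daniel's numbers
needs no rebuild. A rough sketch, with /dev/sdd only as an example device:

    # CFQ's tunables live under iosched/ while CFQ is the active scheduler
    cat /sys/block/sdd/queue/iosched/slice_idle        # default is 8
    echo 0 > /sys/block/sdd/queue/iosched/slice_idle   # disable idling
    hdparm -t /dev/sdd                                 # re-test throughput
]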
> > >  > [snip]
> > >  > ...
> > >  > [snip]
> > >  > >  >
> > [snip]
> > 
> > ...
> > 
> > [snip]
> > >  > well - back to topic:
> > >  >
> > >  > for a blktrace one needs to enable CONFIG_BLK_DEV_IO_TRACE, right?
> > >  > blktrace can be obtained from your git repo?
> > >
> > >  Yes on both counts, or just grab a blktrace snapshot from:
> > >
> > >  http://brick.kernel.dk/snaps/blktrace-git-latest.tar.gz
> > >
> > >  if you don't use git.
> > >
> > >  --
> > >  Jens Axboe
> > >
> > >
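[Aside: a rough capture recipe, assuming a kernel built with
CONFIG_BLK_DEV_IO_TRACE=y and the device name only as an example:

    mount -t debugfs debugfs /sys/kernel/debug   # blktrace needs debugfs
    blktrace -d /dev/sdd -o sdd -w 30 &          # trace for 30 seconds
    hdparm -t /dev/sdd                           # generate some read load
    wait                                         # writes sdd.blktrace.<cpu>
    blkparse -i sdd | less                       # inspect the events
]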
> > 
> > Unfortunately that snapshot wouldn't compile for me because of an error,
> > so I used the in-tree snapshot provided by portage: 0.0.20071210202527.
> > I hope that's OK, too;
> 
> That's fine, it doesn't really matter. But I'd appreciate it if you sent
> me the compile error in private, so that I can fix it :-)
> 
> > attached you'll find the btraces (2 files) as a tar.bz2 package
> > 
> > from cfq
> > 
> > here the corresponding hdparm-output:
> > hdparm -t /dev/sdd
> > 
> > /dev/sdd:
> >  Timing buffered disk reads:  152 MB in  3.02 seconds =  50.38 MB/sec
> > 
> > blktrace /dev/sdd
> > Device: /dev/sdd
> >   CPU  0:                    0 events,     4136 KiB data
> >   CPU  1:                    0 events,       11 KiB data
> >   Total:                     0 events (dropped 0),     4146 KiB data
> > 
> > and the corresponding output of anticipatory and attached the btrace of it:
> > 
> > hdparm -t /dev/sdd
> > 
> > /dev/sdd:
> >  Timing buffered disk reads:  310 MB in  3.02 seconds = 102.76 MB/sec
> > 
> > blktrace /dev/sdd
> > Device: /dev/sdd
> >   CPU  0:                    0 events,     7831 KiB data
> >   CPU  1:                    0 events,      132 KiB data
> >   Total:                     0 events (dropped 0),     7962 KiB data
> 
> They seem to start out the same, but then CFQ gets interrupted by a
> timer unplug (which is also odd) and after that the request size drops.
> On most devices you don't notice, but some are fairly picky about
> request sizes. The end result is that CFQ has an average dispatch
> request size of 142kb, where AS is more than double that at 306kb. I'll
> need to analyze the data and look at the code a bit more to see WHY this
> happens.
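[Aside: the per-scheduler averages quoted above can be read off blkparse's
end-of-run summary; a rough sketch, with the device name only as an example:

    blkparse -i sdd | tail -n 40   # per-CPU and total summaries
    # In the "Total (sdd)" block, dividing the KiB reported next to
    # "Read Dispatches" by the dispatch count gives the average dispatch
    # request size, i.e. numbers comparable to the ~142kb vs ~306kb above.
]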

Here's a test patch; I think we get into this situation because CFQ is
a bit too eager to start queueing again. It's not tested yet and I'll
need to spend some testing time on this, but I'd appreciate some
feedback on whether it changes the situation! The final patch will be a
little more involved.

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index b399c62..ebd8ce2 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1775,18 +1775,8 @@ cfq_rq_enqueued(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 
 	cic->last_request_pos = rq->sector + rq->nr_sectors;
 
-	if (cfqq == cfqd->active_queue) {
-		/*
-		 * if we are waiting for a request for this queue, let it rip
-		 * immediately and flag that we must not expire this queue
-		 * just now
-		 */
-		if (cfq_cfqq_wait_request(cfqq)) {
-			cfq_mark_cfqq_must_dispatch(cfqq);
-			del_timer(&cfqd->idle_slice_timer);
-			blk_start_queueing(cfqd->queue);
-		}
-	} else if (cfq_should_preempt(cfqd, cfqq, rq)) {
+	if ((cfqq != cfqd->active_queue) &&
+		   cfq_should_preempt(cfqd, cfqq, rq)) {
 		/*
 		 * not the active queue - expire current slice if it is
 		 * idle and has expired it's mean thinktime or this new queue
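
[Aside: a rough way to compare before/after the patch, with the device name
only an example; the scheduler can be switched at runtime via sysfs:

    cat /sys/block/sdd/queue/scheduler           # shows [cfq] when active
    echo anticipatory > /sys/block/sdd/queue/scheduler
    hdparm -t /dev/sdd                           # AS baseline
    echo cfq > /sys/block/sdd/queue/scheduler
    hdparm -t /dev/sdd                           # patched CFQ for comparison
]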

-- 
Jens Axboe

