Message-ID: <20091127114847.GZ8742@kernel.dk>
Date:	Fri, 27 Nov 2009 12:48:47 +0100
From:	Jens Axboe <jens.axboe@...cle.com>
To:	Corrado Zoccolo <czoccolo@...il.com>
Cc:	Linux-Kernel <linux-kernel@...r.kernel.org>,
	Jeff Moyer <jmoyer@...hat.com>,
	Vivek Goyal <vgoyal@...hat.com>, mel@....ul.ie, efault@....de
Subject: Re: [RFC,PATCH] cfq-iosched: improve async queue ramp up formula

On Fri, Nov 27 2009, Corrado Zoccolo wrote:
> Hi Jens,
> let me explain why my improved formula should work better.
> 
> The original problem was that, even if an async queue had a slice of 40ms,
> it could take much longer than that to complete, since it could still
> have up to 31 requests dispatched at the moment of expiry.
> In the worst case, completing all dispatched requests could take up to
> 40 + 16 * 8 = 168 ms if they were seeky (taking 8ms as the average
> service time of a seeky request).
> 
> With your patch, within the first 200ms from the last sync, the max
> depth will be 1, so a slice will take at most 48ms.
> My patch still ensures that a slice will take at most 48ms within the
> first 200ms from the last sync, but lifts the restriction that the
> depth must be 1 at all times.
> In fact, after the first 100ms, a new async slice will start by
> allowing 5 requests (async_slice / slice_idle). Then, whenever a
> request completes, we compute remaining_slice / slice_idle and compare
> it with the number of dispatched requests. If it is greater, it means
> we were lucky and the requests were sequential, so we can allow more
> requests to be dispatched. The allowed number of dispatched requests
> decreases as the slice nears its end, and at the very end we allow
> only depth 1.
> For the next 100ms, your patch will allow just depth 2, while my patch
> will allow depth 2 at the end of the slice (but more at the beginning),
> and so on.
> 
> I think Mel's numbers show that this idea can give better and more
> stable timings, and those were with just a single NCQ rotational disk.
> I wonder how much improvement we can get on a RAID, where keeping the
> depth at 1 hits performance really hard.
> Waiting until memory reclaim is noticeably active (since in CFQ we
> will only be sampling it) is probably too late.
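
A minimal userspace sketch of the ramp-up rule Corrado describes above; it
is not the actual CFQ patch, and the helper name, millisecond units and the
standalone main() are assumptions made purely for illustration. The allowed
async dispatch depth is the number of slice_idle periods left in the slice,
never dropping below 1:

#include <stdio.h>

/*
 * Sketch only: permitted async dispatch depth as a function of the time
 * remaining in the slice, following the quoted formula
 * remaining_slice / slice_idle, floored at 1 near the end of the slice.
 */
static unsigned int async_max_depth(unsigned int remaining_slice_ms,
				    unsigned int slice_idle_ms)
{
	unsigned int depth = remaining_slice_ms / slice_idle_ms;

	return depth ? depth : 1;	/* drain at depth 1 once the slice is spent */
}

int main(void)
{
	unsigned int left;

	/* 40ms async slice, 8ms slice_idle: allowed depth ramps down 5 -> 1 */
	for (left = 40; left > 0; left -= 8)
		printf("remaining %2ums -> allowed depth %u\n",
		       left, async_max_depth(left, 8));

	return 0;
}

This mirrors the behaviour described above: early in the slice up to
async_slice / slice_idle = 5 requests may be in flight, and near the end
only one, so a slice still completes within roughly 48ms.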

I'm not saying it's a no-go, just that it invalidates the low-latency
testing done throughout the 2.6.32 cycle, and we should re-run those
tests before committing and submitting anything.

If the 'check for reclaim' hack isn't good enough, then that's probably
what we have to do.

-- 
Jens Axboe

