Message-ID: <49F0FA2F.5030808@cse.unsw.edu.au>
Date: Fri, 24 Apr 2009 09:30:55 +1000
From: Aaron Carroll <aaronc@....unsw.edu.au>
To: Corrado Zoccolo <czoccolo@...il.com>
CC: jens.axboe@...cle.com, Linux-Kernel <linux-kernel@...r.kernel.org>
Subject: Re: Reduce latencies for synchronous writes and high I/O priority requests in deadline IO scheduler
Hi Corrado,
Corrado Zoccolo wrote:
> On Thu, Apr 23, 2009 at 1:52 PM, Aaron Carroll <aaronc@....unsw.edu.au> wrote:
>> Corrado Zoccolo wrote:
>>> Hi,
>>> The deadline I/O scheduler currently classifies all I/O requests into only 2
>>> classes, reads (always considered high priority) and writes (always
>>> lower).
>>> The attached patch, intended to reduce latencies for synchronous writes
>> Can be achieved by switching to sync/async rather than read/write. No
>> one has shown results where this makes an improvement. Let us know if
>> you have a good example.
>
> Yes, this is exactly what my patch does, and the numbers for
> fsync-tester are much better than baseline deadline, almost comparable
> with cfq.
The patch does a bunch of other things too. I can't tell what is due to
the read/write -> sync/async change, and what is due to the rest of it.
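To isolate just that change, you could key deadline's per-direction lists
on rq_is_sync() instead of rq_data_dir() and leave everything else alone.
An untested sketch (deadline_rq_class() is a name I just made up):

	/*
	 * Untested sketch: classify by sync-ness instead of data
	 * direction.  Every place deadline-iosched.c currently indexes
	 * fifo_list[] or sort_list[] with rq_data_dir(rq) would use
	 * this instead.
	 */
	static inline int deadline_rq_class(struct request *rq)
	{
		/* 0 = sync (reads + sync writes), 1 = async writes */
		return rq_is_sync(rq) ? 0 : 1;
	}

Numbers for that alone would tell us how much of your improvement comes
from the classification change.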
>>> and high I/O priority requests, introduces more levels of priorities:
>>> * real time reads: highest priority and shortest deadline, can starve
>>> other levels
>>> * synchronous operations (either best effort reads or RT/BE writes),
>>> mid priority; starvation of lower levels is prevented as usual
>>> * asynchronous operations (async writes and all IDLE class requests),
>>> lowest priority and longest deadline
>>>
>>> The patch also introduces some new heuristics:
>>> * for non-rotational devices, reads (within a given priority level)
>>> are issued in FIFO order, to improve the latency perceived by readers
>> This might be a good idea.
> I think Jens doesn't like it very much.
Let's convince him :)
I think a nice way to do this would be to make fifo_batch=1 the default
for nonrot devices. Of course this will affect writes too...
One problem here is the definition of nonrot. E.g. if H/W RAID drivers
start setting that flag, it will kill performance. Sorting is important
for arrays of rotational disks.
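Concretely it could be a one-liner in deadline_init_queue() (untested;
blk_queue_nonrot() is already available there):

	/*
	 * Untested sketch: single-request batches on non-rotational
	 * devices make dispatch degenerate to FIFO order; rotational
	 * devices keep the stock default of 16.
	 */
	dd->fifo_batch = blk_queue_nonrot(q) ? 1 : fifo_batch;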
>> Can you make this a separate patch?
> I have an earlier attempt, much simpler, at:
> http://lkml.indiana.edu/hypermail/linux/kernel/0904.1/00667.html
>> Is there a good reason not to do the same for writes?
> Well, in that case you could just use noop.
Noop doesn't merge as well as deadline, nor does it provide read/write
differentiation. Is there a performance/QoS argument for not doing it?
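For reference, the front-merge path is exactly what noop is missing:
deadline keeps a sector-sorted rb-tree per direction, and
deadline_merge() does roughly this (trimmed, from memory):

	static int
	deadline_merge(struct request_queue *q, struct request **req,
		       struct bio *bio)
	{
		struct deadline_data *dd = q->elevator->elevator_data;
		struct request *__rq;

		/*
		 * The sector-sorted rb-tree is what makes front merges
		 * findable; noop has no equivalent, so it only gets the
		 * generic back merges.
		 */
		if (dd->front_merges) {
			sector_t sector = bio->bi_sector + bio_sectors(bio);

			__rq = elv_rb_find(&dd->sort_list[bio_data_dir(bio)],
					   sector);
			if (__rq && elv_rq_merge_ok(__rq, bio)) {
				*req = __rq;
				return ELEVATOR_FRONT_MERGE;
			}
		}

		return ELEVATOR_NO_MERGE;
	}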
> I found that this scheme outperforms noop. Random writes, in fact,
> perform quite badly on most SSDs (unless you use a logging FS like
> nilfs2, which transforms them into sequential writes), so having all
> the deadline I/O scheduler machinery to merge write requests is much
> better. As I said, my patched I/O scheduler outperforms noop under my
> normal usage.
You still get the merging... we are only talking about the issue
order here.
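On the insertion side nothing changes: every request already goes into
both structures (roughly, from deadline-iosched.c):

	static void
	deadline_add_request(struct request_queue *q, struct request *rq)
	{
		struct deadline_data *dd = q->elevator->elevator_data;
		const int data_dir = rq_data_dir(rq);

		/* sector-sorted rb-tree: drives merging and sorted dispatch */
		deadline_add_rq_rb(dd, rq);

		/* FIFO list with an expiry stamp: drives deadline enforcement */
		rq_set_fifo_time(rq, jiffies + dd->fifo_expire[data_dir]);
		list_add_tail(&rq->queuelist, &dd->fifo_list[data_dir]);
	}

Issuing reads in FIFO order would only mean dispatch walks fifo_list[]
instead of the rb-tree; merging still sees every queued request.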
>>> * minimum batch timespan (time quantum): partners with fifo_batch to
>>> improve throughput, by sending more consecutive requests together. A
>>> given number of requests will not always take the same time (due to
>>> the amount of seeking needed), therefore fifo_batch must be tuned for worst
>>> cases, while in best cases, having longer batches would give a
>>> throughput boost.
>>> * batch start request is chosen fifo_batch/3 requests before the
>>> expired one, to improve fairness for requests with lower start sectors,
>>> which otherwise have a higher probability of missing a deadline than
>>> mid-sector requests.
>> I don't like the rest of it. I use deadline because it's a simple,
>> no surprises, no bullshit scheduler with reasonably good performance
>> in all situations. Is there some reason why CFQ won't work for you?
>
> I actually like CFQ, and use it almost everywhere, switching to
> deadline only when submitting a heavy-duty workload (having a SysRq
> combination to switch I/O schedulers could sometimes be very handy).
>
> However, on SSDs it's not optimal, so I'm developing this to overcome
> those limitations.
Is this due to the stall on each batch switch?
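If the stall you mean is the idle window CFQ inserts between queues,
that's certainly wasted time on an SSD. From memory, the relevant
cfq-iosched.c defaults are:

	/* cfq-iosched.c defaults, from memory */
	static int cfq_slice_sync = HZ / 10;	/* 100ms sync time slice */
	static int cfq_slice_async = HZ / 25;	/* 40ms async time slice */
	static int cfq_slice_idle = HZ / 125;	/* 8ms idle wait between queues */

slice_idle is tunable at runtime, so it would be worth checking whether
slice_idle=0 already gets CFQ close to deadline on your SSD.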
> In the meantime, I wanted to overcome deadline's limitations as well,
> i.e. the high latencies on fsync/fdatasync.
Did you try dropping the expiry times and/or batch size?
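The stock defaults (from deadline-iosched.c) are fairly conservative
for an SSD:

	/* deadline-iosched.c defaults */
	static const int read_expire = HZ / 2;	/* max time before a read is submitted */
	static const int write_expire = 5 * HZ;	/* ditto for writes; these limits are soft */
	static const int writes_starved = 2;	/* max times reads can starve a write */
	static const int fifo_batch = 16;	/* # of sequential requests treated as one */

All of them are writable under /sys/block/<dev>/queue/iosched/, so it
would be interesting to see the fsync-tester numbers with write_expire
and fifo_batch turned way down before adding new machinery.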
-- Aaron