linux-kernel - Re: [PATCH 2/2] cfq: allow dispatching of both sync and async I/O together


Date:	Tue, 22 Jun 2010 15:21:18 +0200
From:	Jens Axboe <axboe@...nel.dk>
To:	Vivek Goyal <vgoyal@...hat.com>
CC:	Jeff Moyer <jmoyer@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] cfq: allow dispatching of both sync and async I/O
  together

On 2010-06-22 15:18, Vivek Goyal wrote:
> On Tue, Jun 22, 2010 at 08:45:54AM -0400, Jeff Moyer wrote:
>> Vivek Goyal <vgoyal@...hat.com> writes:
>>
>>> On Mon, Jun 21, 2010 at 07:22:08PM -0400, Vivek Goyal wrote:
>>>> On Mon, Jun 21, 2010 at 09:59:48PM +0200, Jens Axboe wrote:
>>>>> On 21/06/10 21.49, Jeff Moyer wrote:
>>>>>> Hi,
>>>>>>
>>>>>> In testing a workload that has a single fsync-ing process and another
>>>>>> process that does a sequential buffered read, I was unable to tune CFQ
>>>>>> to reach the throughput of deadline.  This patch, along with the previous
>>>>>> one, brought CFQ in line with deadline when setting slice_idle to 0.
>>>>>>
>>>>>> I'm not sure what the original reason for not allowing sync and async
>>>>>> I/O to be dispatched together was.  If there is a workload I should be
>>>>>> testing that shows the inherent problems of this, please point me at it
>>>>>> and I will resume testing.  Until and unless that workload is identified,
>>>>>> please consider applying this patch.
>>>>>
>>>>> The problematic case is/was a normal SATA drive with a buffered
>>>>> writer and an occasional reader. I'll have to double check my
>>>>> mail tomorrow, but iirc the issue was that the occasional reader
>>>>> would suffer great latencies since service times for that single
>>>>> IO would be delayed at the drive side. It could perhaps just be
>>>>> a bug in how we handle the slice idling on the read side when the
>>>>> IO gets delayed initially.
>>>>>
>>>>> So if my memory is correct, google for the fsync madness and
>>>>> interactiveness thread that we had some months ago and which
>>>>> caused a lot of tweaking. The commit adding this is
>>>>> 5ad531db6e0f3c3c985666e83d3c1c4d53acccf9 and was added back
>>>>> in July last year. So it was around that time that the mails went
>>>>> around.
>>>>
>>>> Hi Jens,
>>>>
>>>> I suspect we might have introduced this patch because Mike Galbraith
>>>> had issues with application interactiveness (reading data back from swap)
>>>> in the presence of heavy writeout on a SATA disk.
>>>>
>>>> After this patch we did two enhancements.
>>>>
>>>> - You introduced the logic of building write queue depth gradually.
>>>> - Corrado introduced the logic of idling on the random reader service
>>>>   tree.
>>>>
>>>> In the past, random readers were not protected from WRITES as there was no
>>>> idling on random readers. But with Corrado's changes of idling on the
>>>> sync-noidle service tree, I think this problem might have been solved to
>>>> a great extent.
>>>>
>>>> Getting rid of this exclusivity of either SYNC/ASYNC requests in the request
>>>> queue might help us with throughput on storage arrays without losing
>>>> protection for random readers on SATA.
>>>>
>>>> I will do some testing with and without the patch and see if the above
>>>> is true or not.
>>>
>>> Some preliminary testing results with and without the patch. I started a
>>> buffered writer, then started firefox and monitored how much time firefox
>>> took.
>>>
>>> dd if=/dev/zero of=zerofile bs=4K count=1024M
>>>
>>> 2.6.35-rc3 vanilla
>>> ==================
>>> real    0m22.546s
>>> user    0m0.566s
>>> sys     0m0.107s
>>>
>>>
>>> real    0m21.410s
>>> user    0m0.527s
>>> sys     0m0.095s
>>>
>>>
>>> real    0m27.594s
>>> user    0m1.256s
>>> sys     0m0.483s
>>>
>>> 2.6.35-rc3 + jeff's patches
>>> ===========================
>>> real    0m20.372s
>>> user    0m0.635s
>>> sys     0m0.128s
>>>
>>> real    0m22.281s
>>> user    0m0.509s
>>> sys     0m0.093s
>>>
>>> real    0m23.211s
>>> user    0m0.674s
>>> sys     0m0.140s
>>>
>>> So it looks like firefox launch times have not changed much in the presence
>>> of heavy buffered writing on the root disk. I will do more testing tomorrow.
>>
>> Was the buffered writer actually hitting disk?  How much memory is on
>> your system?
> 
> I have 4G of memory in the system. I waited 10-15 seconds after the
> writer had started and then launched firefox, to make sure writes were
> actually hitting the disk.
> 
> Are you seeing different results in your testing?

Just to be sure, this is a regular SATA drive that has NCQ enabled and
running? Apart from that comment, the test sounds good - dirty lots of
memory and ensure that it's writing, then start the reader. Should be
worst case for the reader. Sadly, both the before and after timings
are pretty horrible :-/
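
For reference, the six wall-clock ("real") timings quoted above can be
summarized with a short sketch. This is only an illustration: the variable
names are made up here, and three runs per kernel is too few to say whether
the difference is more than noise.

```python
from statistics import mean

# "real" times in seconds, copied from the quoted firefox launch runs.
# The names below are hypothetical labels, not anything from the thread.
vanilla_runs = [22.546, 21.410, 27.594]   # 2.6.35-rc3 vanilla
patched_runs = [20.372, 22.281, 23.211]   # 2.6.35-rc3 + Jeff's patches

vanilla_mean = mean(vanilla_runs)
patched_mean = mean(patched_runs)

# Spread (max - min) gives a rough feel for run-to-run variation.
vanilla_spread = max(vanilla_runs) - min(vanilla_runs)
patched_spread = max(patched_runs) - min(patched_runs)

print(f"vanilla: mean {vanilla_mean:.3f}s, spread {vanilla_spread:.3f}s")
print(f"patched: mean {patched_mean:.3f}s, spread {patched_spread:.3f}s")
print(f"difference of means: {vanilla_mean - patched_mean:.3f}s")
```

The spread within the vanilla runs is larger than the gap between the two
means, which fits the reading that launch times did not change much.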

-- 
Jens Axboe