Message-ID: <457F583B.9090109@linux.vnet.ibm.com>
Date:	Tue, 12 Dec 2006 17:32:43 -0800
From:	"AVANTIKA R. MATHUR" <mathur@...ux.vnet.ibm.com>
To:	Jens Axboe <jens.axboe@...cle.com>
CC:	Avantika Mathur <mathur@...ibm.com>, linux-kernel@...r.kernel.org
Subject: Re: cfq performance gap

Jens Axboe wrote:
> On Fri, Dec 08 2006, Avantika Mathur wrote:
>   
>> On Fri, 2006-12-08 at 13:05 +0100, Jens Axboe wrote:
>>     
>>> On Thu, Dec 07 2006, Avantika Mathur wrote:
>>>       
>>>> Hi Jens,
>>>>         
>>> (you probably noticed by now, but the axboe@...e.de email is no longer
>>> valid)
>>>       
>> I saw that, thanks!
>>     
>>>> I've noticed a performance gap between the cfq scheduler and other io
>>>> schedulers when running the rawio benchmark.
>>>>
>>>> The benchmark workload is 16 processes running 4k random reads.
>>>>
>>>> Is this performance gap a known issue?
>>>>         
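(For anyone reproducing the comparison: the scheduler can be switched
per device at runtime through sysfs, assuming the other schedulers are
built into this kernel, e.g.

    cat /sys/block/sda/queue/scheduler
    echo deadline > /sys/block/sda/queue/scheduler    # rerun the benchmark
    echo cfq > /sys/block/sda/queue/scheduler         # switch back

and the same rawread invocation rerun under each scheduler.)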
>>> CFQ could be a little slower at this benchmark, but your results are
>>> much worse than I would expect. What is the queueing depth of sda? How
>>> are you invoking rawio?
>>>       
>> I am running rawio with the following options:
>> rawread -p 16 -m 1 -d 1 -x -z -t 0 -s 4096
>>
>> The queue depth on sda is 4.
>>     
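(For reference, the queue depth and the block-layer queue size can be
read from sysfs; assuming a SCSI disk, the relevant entries are roughly:

    cat /sys/block/sda/device/queue_depth    # device/driver queue depth
    cat /sys/block/sda/queue/nr_requests     # block-layer request queue size

the exact paths can differ by driver.)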
>>> Your runtime is very low; how does it look if you allow the test to run
>>> for much longer? 30MiB/sec random read bandwidth seems very high; I'm
>>> wondering what exactly is being tested here.
>>>       
>> rawio is actually performing sequential reads, but I don't believe it is
>> purely sequential with multiple processes running.
>> I am currently running the test with longer runtimes and will post
>> results once it is complete.
>> I've also attached the rawio source.
>>     
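(As a rough sketch only, not the benchmark actually used: a comparable
load of 16 readers doing 4k reads against a raw device could be
generated with an fio job along the lines of

    fio --name=rawtest --filename=/dev/sdb --direct=1 --rw=read \
        --bs=4k --numjobs=16 --runtime=300 --time_based

where the device path and runtime are placeholders.)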
>
> It's certainly the slice and idling hurting here. But at the same time,
> I don't really think your test case is very interesting. The test area
> is very small and you have 16 threads trying to read the same thing;
> optimizing for that would be silly, as I don't think it has much
> real-world relevance.
>
>   
Could a database have a workload similar to this test?
> That said, I might add some logic to detect when we can cheaply switch
> queues instead of waiting for a new request from the same queue.
> Averaging slice times over a period of time instead of 1:1 with that
> logic should help cases like this while still being fair.
>   
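(To confirm it is the idle window, assuming the cfq tunables are
exported under /sys/block/sda/queue/iosched/ on this kernel, one cheap
experiment is:

    echo 0 > /sys/block/sda/queue/iosched/slice_idle    # disable idling
    # rerun rawread, then restore the default
    echo 8 > /sys/block/sda/queue/iosched/slice_idle

a large jump with idling off would point at the slice idling cost.)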
Thank you for looking at this issue.
I've found an IBM/SUSE bugzilla bug for the same performance gap on
rawio. A fix for this bug was included in SLES10-RC1; do you know why
it was not added to mainline?

Thanks again,
Avantika Mathur
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
