linux-kernel - Re: performance "regression" in cfq compared to anticipatory, deadline and noop

Message-ID: <6278d2220808250839j1dc25c02uda7bf8b6b150acb7@mail.gmail.com>
Date:	Mon, 25 Aug 2008 16:39:01 +0100
From:	"Daniel J Blueman" <daniel.blueman@...il.com>
To:	"Fabio Checconi" <fchecconi@...il.com>
Cc:	"Jens Axboe" <jens.axboe@...cle.com>,
	Matthew <jackdachef@...il.com>,
	"Kasper Sandberg" <lkml@...anurb.dk>,
	"Linux Kernel" <linux-kernel@...r.kernel.org>
Subject: Re: performance "regression" in cfq compared to anticipatory, deadline and noop

On Mon, Aug 25, 2008 at 9:29 PM, Fabio Checconi <fchecconi@...il.com> wrote:
> Hi,
>
>> From: Daniel J Blueman <daniel.blueman@...il.com>
>> Date: Sun, Aug 24, 2008 09:24:37PM +0100
>>
>> Hi Fabio, Jens,
>>
> ...
>> This was the last test I didn't get around to. Alas, it did help, but
>> didn't give the merging required for full performance:
>>
>> # echo 1 >/proc/sys/vm/drop_caches; dd if=/dev/sda of=/dev/null bs=128k count=2000
>> 262144000 bytes (262 MB) copied, 2.47787 s, 106 MB/s
>>
>> # echo 1 >/proc/sys/vm/drop_caches; hdparm -t /dev/sda
>> Timing buffered disk reads:  308 MB in  3.01 seconds = 102.46 MB/sec
>>
>> It is an improvement over the baseline performance of 2.6.27-rc4:
>>
>> # echo 1 >/proc/sys/vm/drop_caches; dd if=/dev/sda of=/dev/null bs=128k count=2000
>> 262144000 bytes (262 MB) copied, 2.56514 s, 102 MB/s
>>
>> # echo 1 >/proc/sys/vm/drop_caches; hdparm -t /dev/sda
>> Timing buffered disk reads:  294 MB in  3.02 seconds =  97.33 MB/sec
>>
>> Note that platter speed is around 125MB/s (which I get close to at
>> smaller read sizes).
>>
>> I feel 128KB read requests are perhaps important, as this is a
>> commonly-used RAID stripe size, and may explain the read-performance
>> drop we sometimes see in hardware vs software RAID benchmarks.
>>
>> How can we generate some ideas or movement on fixing/improving this behaviour?
>>
>
> Thank you for testing.  The blktrace output for this run should be
> interesting, esp. to compare it with a blktrace obtained from anticipatory
> with the same workload - IIRC anticipatory didn't suffer from the problem,
> and anticipatory has a slightly different dispatching mechanism that
> this patch tried to bring into cfq.
>
> Even if a proper fix may not belong to the elevator itself, I think
> that this pair of traces (this last test + anticipatory) should help
> in better understanding what is still going wrong.
>
> Thank you in advance.

See http://quora.org/blktrace-n.tar.bz2

Where n is:
 0 - 2.6.27-rc4 unpatched
 1 - 2.6.27-rc4 with your CFQ patch, CFQ scheduler
 2 - 2.6.27-rc4 with your CFQ patch, anticipatory scheduler
 3 - 2.6.27-rc4 with your CFQ patch, deadline scheduler
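A minimal sketch of expanding n into the four archive URLs, assuming the
pattern above is meant literally. The function name blktrace_urls is
hypothetical and nothing here checks that the files are still served:

```shell
# Print one URL per trace archive, n = 0..3 as listed above.
blktrace_urls() {
    for n in 0 1 2 3; do
        printf 'http://quora.org/blktrace-%s.tar.bz2\n' "$n"
    done
}

blktrace_urls
```

The output can be piped to a downloader that reads URLs from standard
input, for example "blktrace_urls | wget -i -".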

I have found it's not always possible to reproduce this issue. For
example, with stock CFQ I am now seeing a consistent 117-123MB/s with
hdparm and dd (as above), whereas I was previously seeing a consistent
95-103MB/s. The blktraces may therefore not show the slower-performance
pattern, even with precisely the same (controlled) environment.
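
When comparing runs that differ by only a few MB/s, it can help to
recompute throughput from dd's byte count and elapsed time rather than
rely on its rounded summary. A minimal sketch, assuming awk is available;
the helper name mbps is hypothetical, and MB here means 10^6 bytes, as in
dd's output:

```shell
# mbps BYTES SECONDS: print throughput in MB/s (10^6 bytes per second).
mbps() {
    LC_ALL=C awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

mbps 262144000 2.47787   # patched CFQ run quoted above, prints 105.8
mbps 262144000 2.56514   # 2.6.27-rc4 baseline quoted above, prints 102.2
```

On the quoted figures, the patched run comes out roughly 3.5% faster than
the baseline, which is small next to the 95-103MB/s versus 117-123MB/s
spread between otherwise identical runs.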

Thanks,
  Daniel
-- 
Daniel J Blueman