linux-kernel - Re: CFQ read performance regression


Date:	Sat, 17 Apr 2010 14:46:17 +0200
From:	Corrado Zoccolo <czoccolo@...il.com>
To:	Miklos Szeredi <mszeredi@...e.cz>
Cc:	Jens Axboe <jens.axboe@...cle.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Jan Kara <jack@...e.cz>, Suresh Jayaraman <sjayaraman@...e.de>
Subject: Re: CFQ read performance regression

Hi Miklos,
I don't think this is related to CFQ. I've made a graph of the
accessed (read) sectors (see attached).
You can see that the green cloud (2.6.16) is much more concentrated,
while the red one (2.6.32) is split in two, and its individual lines
are easier to tell apart.
This means that the FS put more distance between the blocks of the
files written by the tio threads, so read time suffers because the
disk head has to perform longer seeks. On the other hand, if you read
those files sequentially with a single thread, the performance may be
better with the new layout, so YMMV. When testing 2.6.32 and up, you
should also consider testing with the low_latency setting disabled,
since tuning for latency can negatively affect throughput.
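
To make the seek argument concrete, here is a minimal sketch in Python. It is
not derived from the attached graph or from the blktraces: the sector numbers,
the file count and the helper names (total_seek_distance, interleave) are all
made up for illustration, and summed seek distance is only a crude proxy for
real seek time. It compares a compact layout with a spread-out one when several
readers are interleaved, and then reads the spread-out layout with one
sequential pass.

```python
def total_seek_distance(sectors):
    """Sum of absolute head movements (in sectors) over an access sequence."""
    return sum(abs(b - a) for a, b in zip(sectors, sectors[1:]))


def interleave(files):
    """Round-robin across per-thread files, roughly as concurrent readers would."""
    return [sector for group in zip(*files) for sector in group]


# Hypothetical layouts: 4 files of 4 blocks each, 8 sectors per block.
# "compact" keeps the files next to each other, "spread" puts them far apart.
compact = [[base + 8 * i for i in range(4)] for base in (0, 32, 64, 96)]
spread = [[base + 8 * i for i in range(4)] for base in (0, 10_000, 20_000, 30_000)]

# Interleaved readers: the head jumps between files on every request.
compact_cost = total_seek_distance(interleave(compact))
spread_cost = total_seek_distance(interleave(spread))

# One reader going through the spread-out files one after another.
sequential_cost = total_seek_distance([s for f in spread for s in f])

print(compact_cost, spread_cost, sequential_cost)
```

With these invented numbers the interleaved readers travel far more on the
spread-out layout than on the compact one, while a single sequential pass over
the spread-out files avoids most of the long jumps.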

Thanks,
Corrado

On Fri, Apr 16, 2010 at 2:27 PM, Miklos Szeredi <mszeredi@...e.cz> wrote:
> Hi Jens,
>
> I'm chasing a performance bottleneck identified by tiobench that seems
> to be caused by CFQ.  On a SLES10-SP3 kernel (2.6.16, with some patches
> moving cfq closer to 2.6.17) tiobench with 8 threads gets about 260MB/s
> sequential read throughput.  On recent kernels (including vanilla
> 2.6.34-rc) it gets about 145MB/s, a regression of 45%.  The queue and
> readahead parameters are the same.
>
> This goes back some time; 2.6.27 already seems to have bad
> performance.
>
> Changing the scheduler to noop will increase the throughput back into
> the 260MB/s range.  So this is not a driver issue.
>
> Also increasing quantum *and* readahead will increase the throughput,
> but not by as much.  Both noop and these tweaks decrease the write
> throughput somewhat however...
>
> Apparently on recent kernels the number of dispatched requests stays
> mostly at or below 4 and the dispatched sector count at or below 2000,
> which is not enough to fill the bandwidth on this setup.
>
> On 2.6.16 the number of dispatched requests hovers around 22 and the
> sector count around 16000.
>
> I uploaded blktraces for the read part of the tiobench runs for both
> 2.6.16 and 2.6.32:
>
>  http://www.kernel.org/pub/linux/kernel/people/mszeredi/blktrace/
>
> Do you have any idea about the cause of this regression?
>
> Thanks,
> Miklos

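The figures quoted above from Miklos's report can be checked with a few lines
of arithmetic. This is only a sketch: the helper names are invented here, and
the 512-byte sector size is an assumption (the report does not state it).

```python
SECTOR_BYTES = 512  # assumption: sector counts are in 512-byte units


def regression_percent(old, new):
    """Relative throughput drop from old to new, in percent."""
    return (old - new) / old * 100


def sectors_to_mib(sectors):
    """Convert a dispatched sector count to MiB."""
    return sectors * SECTOR_BYTES / (1024 * 1024)


# Throughput figures from the report: 260MB/s on 2.6.16, 145MB/s on recent kernels.
drop = regression_percent(260, 145)

# Dispatched sector counts from the report: about 16000 (2.6.16) vs. about 2000.
old_inflight_mib = sectors_to_mib(16000)
new_inflight_mib = sectors_to_mib(2000)

print(f"regression: {drop:.1f}%")
print(f"in flight: {old_inflight_mib:.2f} MiB vs {new_inflight_mib:.2f} MiB")
```

Under that sector-size assumption, the older kernel keeps roughly eight times
as much data dispatched as the recent ones, and the throughput drop works out
to a little over 44%, in line with the reported 45%.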


-- 
__________________________________________________________________________

dott. Corrado Zoccolo                          mailto:czoccolo@...il.com
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------
The self-confidence of a warrior is not the self-confidence of the average
man. The average man seeks certainty in the eyes of the onlooker and calls
that self-confidence. The warrior seeks impeccability in his own eyes and
calls that humbleness.
                               Tales of Power - C. Castaneda

Download attachment "access.png" of type "image/png" (12253 bytes)