Message-ID: <4034AD9F-2A6A-4AE6-B5FC-58FC2BC238F5@michaelmarod.com>
Date: Thu, 31 Mar 2022 23:22:03 +0000
From: Michael Marod <michael@...haelmarod.com>
To: Christoph Hellwig <hch@...radead.org>
Cc: linux-kernel@...r.kernel.org, linux-block@...r.kernel.org
Subject: Re: NVME performance regression in Linux 5.x due to lack of block level IO queueing
Good call -- it turns out that the cache issue is resolved in 5.17. I tried a number of kernels and narrowed the problem down: it started after 4.9 and no later than 4.15, and it was fixed some time after 5.13 and no later than 5.17. Namely, 4.9 is good, 4.15 is bad, 5.13 is bad, and 5.17 is good. I did not bisect it down to the specific versions where the behavior changed.
Device       r/s      w/s     rkB/s     wkB/s  rrqm/s  wrqm/s  %rrqm  %wrqm  r_await  w_await  aqu-sz  rareq-sz  wareq-sz  svctm   %util
nvme1n1  2758.00  2783.00  11032.00  11132.00    0.00    0.00   0.00   0.00     0.10     0.03    0.36      4.00      4.00   0.18  100.00
nvme0n1  2830.00  2875.00  11320.00  11500.00    0.00    0.00   0.00   0.00     0.10     0.03    0.39      4.00      4.00   0.18  100.00
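As a sanity check on the iostat figures above, the columns can be cross-checked against each other. The sketch below assumes the usual relationships (throughput = requests/s x request size, and average queue size roughly equal to requests/s x await, per Little's law); the dictionary keys and the function name are made up for illustration and are not iostat's own names.

```python
# Values copied from the iostat output above. The key names are
# hypothetical labels chosen for this sketch.
samples = {
    "nvme1n1": {"r_s": 2758.0, "w_s": 2783.0, "rkB_s": 11032.0, "wkB_s": 11132.0,
                "r_await_ms": 0.10, "w_await_ms": 0.03, "req_kb": 4.0, "aqu_sz": 0.36},
    "nvme0n1": {"r_s": 2830.0, "w_s": 2875.0, "rkB_s": 11320.0, "wkB_s": 11500.0,
                "r_await_ms": 0.10, "w_await_ms": 0.03, "req_kb": 4.0, "aqu_sz": 0.39},
}


def derived(sample):
    """Return (read kB/s, write kB/s, estimated average queue size)."""
    read_kb_s = sample["r_s"] * sample["req_kb"]
    write_kb_s = sample["w_s"] * sample["req_kb"]
    # Little's law: requests in flight = arrival rate * time in system.
    # The await columns are in milliseconds, hence the division by 1000.
    queue = (sample["r_s"] * sample["r_await_ms"]
             + sample["w_s"] * sample["w_await_ms"]) / 1000.0
    return read_kb_s, write_kb_s, queue


if __name__ == "__main__":
    for dev, s in samples.items():
        read_kb_s, write_kb_s, queue = derived(s)
        print(f"{dev}: read {read_kb_s:.0f} kB/s (reported {s['rkB_s']:.0f}), "
              f"write {write_kb_s:.0f} kB/s (reported {s['wkB_s']:.0f}), "
              f"estimated queue {queue:.2f} (reported aqu-sz {s['aqu_sz']:.2f})")
```

The await columns are rounded to two decimals, so the queue estimate only approximates the reported aqu-sz. Either way, on average fewer than one request was in flight per device during this sample.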
With regard to performance between 4.4.0 and 5.17: for a single thread, 4.4.0 still performed better than 5.17. However, the 5.17 kernel was significantly better with multiple threads. In fact, it is so much better that I hardly believe the results (10x improvement!). Is it expected that a single thread would be slower in 5.17, but that recent improvements make it possible to run many of them in parallel more efficiently?
# /usr/local/bin/fio -name=randrw -filename=/opt/foo -direct=1 -iodepth=1 -thread -rw=randrw -ioengine=psync -bs=4k -size=10G -numjobs=16 -group_reporting=1 -runtime=120
// Ubuntu 16.04 / Linux 4.4.0:
Run status group 0 (all jobs):
READ: bw=54.5MiB/s (57.1MB/s), 54.5MiB/s-54.5MiB/s (57.1MB/s-57.1MB/s), io=6537MiB (6854MB), run=120002-120002msec
WRITE: bw=54.5MiB/s (57.2MB/s), 54.5MiB/s-54.5MiB/s (57.2MB/s-57.2MB/s), io=6544MiB (6862MB), run=120002-120002msec
// Ubuntu 18.04 / Linux 5.4.0:
Run status group 0 (all jobs):
READ: bw=23.5MiB/s (24.7MB/s), 23.5MiB/s-23.5MiB/s (24.7MB/s-24.7MB/s), io=2821MiB (2959MB), run=120002-120002msec
WRITE: bw=23.5MiB/s (24.6MB/s), 23.5MiB/s-23.5MiB/s (24.6MB/s-24.6MB/s), io=2819MiB (2955MB), run=120002-120002msec
// Ubuntu 18.04 / Linux 5.17:
Run status group 0 (all jobs):
READ: bw=244MiB/s (255MB/s), 244MiB/s-244MiB/s (255MB/s-255MB/s), io=28.6GiB (30.7GB), run=120001-120001msec
WRITE: bw=244MiB/s (256MB/s), 244MiB/s-244MiB/s (256MB/s-256MB/s), io=28.6GiB (30.7GB), run=120001-120001msec
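To put the three runs side by side, here is a small sketch that computes the read-bandwidth ratios and the implied 4 KiB IOPS from the fio summaries above. The numbers are copied from the READ lines; the variable and function names are only illustrative.

```python
# Aggregate READ bandwidth in MiB/s from the three fio runs above
# (numjobs=16, bs=4k). Names are hypothetical, chosen for this sketch.
read_bw_mib_s = {"4.4.0": 54.5, "5.4.0": 23.5, "5.17": 244.0}

BLOCK_SIZE_KIB = 4  # matches -bs=4k in the fio command line


def ratio(kernel_a, kernel_b):
    """Read bandwidth of kernel_a relative to kernel_b."""
    return read_bw_mib_s[kernel_a] / read_bw_mib_s[kernel_b]


def iops(bw_mib_s):
    """Approximate IOPS implied by a bandwidth at the fixed block size."""
    return bw_mib_s * 1024 / BLOCK_SIZE_KIB


if __name__ == "__main__":
    print(f"5.17 vs 5.4.0:  {ratio('5.17', '5.4.0'):.1f}x")
    print(f"5.17 vs 4.4.0:  {ratio('5.17', '4.4.0'):.1f}x")
    print(f"5.4.0 vs 4.4.0: {ratio('5.4.0', '4.4.0'):.2f}x")
    for kernel, bw in read_bw_mib_s.items():
        print(f"{kernel}: ~{iops(bw):.0f} read IOPS")
```

With these figures the roughly 10x gain is relative to 5.4.0; relative to 4.4.0 the same 16-job run on 5.17 is about 4.5x faster.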
Thanks,
Michael