Date: Thu, 7 Mar 2024 00:16:07 +0530
From: Harshit Mogalapalli <harshit.m.mogalapalli@...cle.com>
To: zhiguo.niu@...soc.com, bvanassche@....org, Jens Axboe <axboe@...nel.dk>,
        linux-block@...r.kernel.org
Cc: LKML <linux-kernel@...r.kernel.org>,
        Ramanan Govindarajan <ramanan.govindarajan@...cle.com>,
        Paul Webb <paul.x.webb@...cle.com>, nicky.veitch@...cle.com
Subject: [bug-report] Performance regression with fio sequential-write on a
 multipath setup.

Hi,

We have noticed a performance regression in the kernel with an fio 
sequential-write job.

Notes and observations:
======================
1. This is observed on recent kernels (6.6) when compared with 5.15.y; 
the bisection points to commit d47f9717e5cf ("block/mq-deadline: use 
correct way to throttling write requests").
2. Reverting the above commit restores the performance.
3. The regression can also be seen on 6.8-rc7, and a revert on top of 
that fixes it.
4. The commit looks closely related to the cause of the regression.
5. Note that this happens only with a multipath setup, even with just 
2 block devices.
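For reference, a sketch of how the revert can be tested (assumed commands, 
not from this report; $KSRC is a placeholder for a kernel git tree):

```shell
# Hedged sketch: revert the suspected commit on top of v6.8-rc7 and rebuild.
# $KSRC is an assumed path to a kernel source tree; nothing runs without one.
KSRC=${KSRC:-$HOME/linux}
COMMIT=d47f9717e5cf   # "block/mq-deadline: use correct way to throttling write requests"
if [ -d "$KSRC/.git" ]; then
    git -C "$KSRC" checkout v6.8-rc7
    git -C "$KSRC" revert --no-edit "$COMMIT"
    make -C "$KSRC" -j"$(nproc)"
else
    echo "no kernel tree at $KSRC; skipping" >&2
fi
```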

Test details:
============
(A) fio.write job

fio-3.19 -- fio version

[global]
ioengine=libaio
rw=write
bs=128k
iodepth=64
numjobs=24
direct=1
fsync=1
runtime=600
group_reporting

[job]
filename=/dev/dm-0
[job]
filename=/dev/dm-1

Each disk is 600G in size.

(B) Test results

6.8-rc7: 2 block devices with multi-path
-------

job: (g=0): rw=write, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 
128KiB-128KiB, ioengine=libaio, iodepth=64
..
job: (g=0): rw=write, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 
128KiB-128KiB, ioengine=libaio, iodepth=64
..
fio-3.19
Starting 48 processes

job: (groupid=0, jobs=48): err= 0: pid=6164: Wed Mar  6 17:58:33 2024
   write: IOPS=1884, BW=236MiB/s (247MB/s)(138GiB/600319msec); 0 zone resets
     slat (usec): min=2, max=540462, avg=25445.35, stdev=24181.85
     clat (msec): min=9, max=4941, avg=1602.56, stdev=339.05
      lat (msec): min=9, max=4973, avg=1628.00, stdev=342.19
     clat percentiles (msec):
      |  1.00th=[  986],  5.00th=[ 1167], 10.00th=[ 1250], 20.00th=[ 1368],
      | 30.00th=[ 1435], 40.00th=[ 1502], 50.00th=[ 1569], 60.00th=[ 1636],
      | 70.00th=[ 1703], 80.00th=[ 1804], 90.00th=[ 1955], 95.00th=[ 2140],
      | 99.00th=[ 2869], 99.50th=[ 3239], 99.90th=[ 3842], 99.95th=[ 4010],
      | 99.99th=[ 4329]
    bw (  KiB/s): min=47229, max=516492, per=100.00%, avg=241546.47, 
stdev=1326.92, samples=57259
    iops        : min=  322, max= 3996, avg=1843.17, stdev=10.39, 
samples=57259
   lat (msec)   : 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01%, 250=0.02%
   lat (msec)   : 500=0.06%, 750=0.14%, 1000=0.93%, 2000=90.41%, 
 >=2000=8.42%
   fsync/fdatasync/sync_file_range:
     sync (nsec): min=10, max=57940, avg=104.23, stdev=498.86
     sync percentiles (nsec):
      |  1.00th=[   13],  5.00th=[   19], 10.00th=[   26], 20.00th=[   61],
      | 30.00th=[   68], 40.00th=[   72], 50.00th=[   75], 60.00th=[   78],
      | 70.00th=[   87], 80.00th=[  167], 90.00th=[  175], 95.00th=[  177],
      | 99.00th=[  221], 99.50th=[  231], 99.90th=[  318], 99.95th=[15680],
      | 99.99th=[17792]
   cpu          : usr=0.08%, sys=0.16%, ctx=1096948, majf=0, minf=1995
   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, 
 >=64=199.5%
      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
 >=64=0.0%
      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, 
 >=64=0.0%
      issued rwts: total=0,1131018,0,1127994 short=0,0,0,0 dropped=0,0,0,0
      latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   WRITE: bw=236MiB/s (247MB/s), 236MiB/s-236MiB/s (247MB/s-247MB/s), 
io=138GiB (148GB), run=600319-600319msec

Disk stats (read/write):
     dm-0: ios=50/533034, merge=0/27056, ticks=16/113070163, 
in_queue=113070180, util=100.00%, aggrios=43/266595, aggrmerge=0/0, 
aggrticks=156/56542549, aggrin_queue=56542706, aggrutil=100.00%
   sdac: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
   sde: ios=86/533191, merge=0/0, ticks=313/113085099, 
in_queue=113085413, util=100.00%
     dm-1: ios=5/534381, merge=0/36389, ticks=240/113110344, 
in_queue=113110584, util=100.00%, aggrios=7/267191, aggrmerge=0/0, 
aggrticks=153/56543654, aggrin_queue=56543807, aggrutil=100.00%
   sdf: ios=14/534382, merge=0/0, ticks=306/113087308, 
in_queue=113087615, util=100.00%
   sdad: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

Throughput Results:
WRITE:247:1884:0


6.8-rc7 + revert: 2 block devices with multi-path
-------

job: (g=0): rw=write, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 
128KiB-128KiB, ioengine=libaio, iodepth=64
..
job: (g=0): rw=write, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 
128KiB-128KiB, ioengine=libaio, iodepth=64
..
fio-3.19
Starting 48 processes

job: (groupid=0, jobs=48): err= 0: pid=6104: Wed Mar  6 18:29:13 2024
   write: IOPS=2518, BW=315MiB/s (330MB/s)(185GiB/600339msec); 0 zone resets
     slat (usec): min=2, max=923472, avg=6789.22, stdev=20329.20
     clat (msec): min=4, max=6020, avg=1212.68, stdev=714.90
      lat (msec): min=4, max=6020, avg=1219.47, stdev=718.40
     clat percentiles (msec):
      |  1.00th=[  203],  5.00th=[  309], 10.00th=[  384], 20.00th=[  535],
      | 30.00th=[  709], 40.00th=[  911], 50.00th=[ 1133], 60.00th=[ 1334],
      | 70.00th=[ 1519], 80.00th=[ 1754], 90.00th=[ 2198], 95.00th=[ 2601],
      | 99.00th=[ 3171], 99.50th=[ 3608], 99.90th=[ 4329], 99.95th=[ 4597],
      | 99.99th=[ 5134]
    bw (  KiB/s): min=12237, max=1834896, per=100.00%, avg=413187.52, 
stdev=6322.04, samples=44948
    iops        : min=   48, max=14314, avg=3186.68, stdev=49.49, 
samples=44948
   lat (msec)   : 10=0.01%, 20=0.01%, 50=0.09%, 100=0.02%, 250=2.28%
   lat (msec)   : 500=15.45%, 750=14.26%, 1000=11.83%, 2000=42.52%, 
 >=2000=13.55%
   fsync/fdatasync/sync_file_range:
     sync (nsec): min=10, max=76066, avg=57.85, stdev=299.52
     sync percentiles (nsec):
      |  1.00th=[   13],  5.00th=[   14], 10.00th=[   15], 20.00th=[   16],
      | 30.00th=[   17], 40.00th=[   20], 50.00th=[   28], 60.00th=[   47],
      | 70.00th=[   65], 80.00th=[   80], 90.00th=[  103], 95.00th=[  175],
      | 99.00th=[  237], 99.50th=[  241], 99.90th=[  262], 99.95th=[  318],
      | 99.99th=[16512]
   cpu          : usr=0.06%, sys=0.07%, ctx=531434, majf=0, minf=728
   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, 
 >=64=199.6%
      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
 >=64=0.0%
      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, 
 >=64=0.0%
      issued rwts: total=0,1511918,0,1508894 short=0,0,0,0 dropped=0,0,0,0
      latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   WRITE: bw=315MiB/s (330MB/s), 315MiB/s-315MiB/s (330MB/s-330MB/s), 
io=185GiB (198GB), run=600339-600339msec

Disk stats (read/write):
     dm-0: ios=0/246318, merge=0/493981, ticks=0/142584585, 
in_queue=142584586, util=99.17%, aggrios=6/181454, aggrmerge=0/0, 
aggrticks=112/70608689, aggrin_queue=70608801, aggrutil=84.92%
   sdac: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
   sde: ios=12/362908, merge=0/0, ticks=224/141217379, 
in_queue=141217603, util=84.92%
     dm-1: ios=0/233211, merge=0/538097, ticks=0/142579042, 
in_queue=142579043, util=99.15%, aggrios=8/174475, aggrmerge=0/0, 
aggrticks=128/70654686, aggrin_queue=70654814, aggrutil=85.20%
   sdf: ios=16/348951, merge=0/0, ticks=256/141309372, 
in_queue=141309628, util=85.20%
   sdad: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

Throughput Results:
WRITE:330:2518:0

(C) Performance difference:

The revert improves write IOPS from 1884 to 2518, roughly a 33.65% 
performance change. This is reproducible on a higher number of block 
devices as well.
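The figures above can be cross-checked from the fio output itself (a quick 
computation, not a new measurement): the relative IOPS change between the 
two runs, and the IOPS implied by each bandwidth at bs=128k:

```shell
# Cross-check the numbers reported by the two fio runs above.
awk 'BEGIN {
    # relative IOPS change: 1884 (6.8-rc7) -> 2518 (with the revert)
    printf "delta: %.2f%%\n", (2518 - 1884) / 1884 * 100
    # IOPS implied by bandwidth at bs=128k: BW(MiB/s) * 1024 / 128
    printf "iops @ 236 MiB/s: %d\n", 236 * 1024 / 128
    printf "iops @ 315 MiB/s: %d\n", 315 * 1024 / 128
}'
```

This prints a 33.65% delta, and implied IOPS of 1888 and 2520, consistent 
with the reported 1884 and 2518.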



Thanks to Paul Webb for identifying this regression and sharing the details.
We will be happy to test any patches to check the change in performance, 
and to follow any other suggestions.


Thanks,
Harshit
