Message-ID: <88d31a258feb36425ad73d0323077972f85f8341.camel@linaro.org>
Date: Tue, 29 Jul 2025 12:20:42 +0100
From: André Draszik <andre.draszik@...aro.org>
To: Neil Armstrong <neil.armstrong@...aro.org>, Bart Van Assche
<bvanassche@....org>, Alim Akhtar <alim.akhtar@...sung.com>, Avri Altman
<avri.altman@....com>, "James E.J. Bottomley"
<James.Bottomley@...senPartnership.com>, "Martin K. Petersen"
<martin.petersen@...cle.com>
Cc: Peter Griffin <peter.griffin@...aro.org>, Tudor Ambarus
<tudor.ambarus@...aro.org>, Will McVicker <willmcvicker@...gle.com>,
Manivannan Sadhasivam <mani@...nel.org>, kernel-team@...roid.com,
linux-arm-msm@...r.kernel.org, linux-samsung-soc@...r.kernel.org,
linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
stable@...r.kernel.org
Subject: Re: [PATCH v2 2/2] scsi: ufs: core: move some irq handling back to
hardirq (with time limit)
On Mon, 2025-07-28 at 18:55 +0200, Neil Armstrong wrote:
> Hi,
>
> On 28/07/2025 17:19, Bart Van Assche wrote:
> > On 7/28/25 7:49 AM, André Draszik wrote:
> > > Btw, my complete command was (should probably have added that
> > > to the commit message in the first place):
> > >
> > > for rw in read write ; do
> > > echo "rw: ${rw}"
> > > for jobs in 1 8 ; do
> > > echo "jobs: ${jobs}"
> > > for it in $(seq 1 5) ; do
> > > fio --name=rand${rw} --rw=rand${rw} \
> > > --ioengine=libaio --direct=1 \
> > > --bs=4k --numjobs=${jobs} --size=32m \
> > > --runtime=30 --time_based --end_fsync=1 \
> > > --group_reporting --filename=/foo \
> > > | grep -E '(iops|sys=|READ:|WRITE:)'
> > > sleep 5
> > > done
> > > done
> > > done
> >
> > Please run performance tests in recovery mode against a block
> > device (/dev/block/sd...) instead of running performance tests on
> > top of a filesystem. One possible approach for retrieving the block
> > device name is as follows:
> >
> > adb shell readlink /dev/block/by-name/userdata
> >
> > There may be other approaches for retrieving the name of the block
> > device associated with /data. Additionally, tuning for maximum
> > performance is useful because it eliminates impact from the process
> > scheduler on block device performance measurement. An extract from a
> > script that I use myself to measure block device performance on Pixel
> > devices is available below.
>
> Of course, I did all that and ran on the SM8650 QRD & HDK boards; one has
> a UFS 3.1 device and the other a UFS 4.0 device.
>
> Here's the raw data:
>
> Board: sm8650-qrd
> read / 1 job
> v6.15 v6.16 v6.16 + this commit
> min IOPS 3,996.00 5,921.60 3,424.80
> max IOPS 4,772.80 6,491.20 4,541.20
> avg IOPS 4,526.25 6,295.31 4,320.58
> cpu % usr 4.62 2.96 4.50
> cpu % sys 21.45 17.88 21.62
> bw MB/s 18.54 25.78 17.64
>
> read / 8 job
> v6.15 v6.16 v6.16 + this commit
> min IOPS 51,867.60 51,575.40 45,257.00
> max IOPS 67,513.60 64,456.40 56,336.00
> avg IOPS 64,314.80 62,136.76 52,505.72
> cpu % usr 3.98 3.72 3.52
> cpu % sys 16.70 17.16 18.74
> bw MB/s 263.60 254.40 215.00
>
> write / 1 job
> v6.15 v6.16 v6.16 + this commit
> min IOPS 5,654.80 8,060.00 5,730.80
> max IOPS 6,720.40 8,852.00 6,981.20
> avg IOPS 6,576.91 8,579.81 6,726.51
> cpu % usr 7.48 3.79 8.49
> cpu % sys 41.09 23.27 34.86
> bw MB/s 26.96 35.16 27.52
>
> write / 8 job
> v6.15 v6.16 v6.16 + this commit
> min IOPS 84,687.80 95,043.40 74,799.60
> max IOPS 107,620.80 113,572.00 96,377.20
> avg IOPS 97,910.86 105,927.38 87,239.07
> cpu % usr 5.43 4.38 3.72
> cpu % sys 21.73 20.29 30.97
> bw MB/s 400.80 433.80 357.40
>
> Board: sm8650-hdk
> read / 1 job
> v6.15 v6.16 v6.16 + this commit
> min IOPS 4,867.20 5,596.80 4,242.80
> max IOPS 5,211.60 5,970.00 4,548.80
> avg IOPS 5,126.12 5,847.93 4,370.14
> cpu % usr 3.83 2.81 2.62
> cpu % sys 18.29 13.44 16.89
> bw MB/s 20.98 17.88 23.96
>
> read / 8 job
> v6.15 v6.16 v6.16 + this commit
> min IOPS 47,583.80 46,831.60 47,671.20
> max IOPS 58,913.20 59,442.80 56,282.80
> avg IOPS 53,609.04 44,396.88 53,621.46
> cpu % usr 3.57 3.06 3.11
> cpu % sys 15.23 19.31 15.90
> bw MB/s 219.40 219.60 210.80
>
> write / 1 job
> v6.15 v6.16 v6.16 + this commit
> min IOPS 6,529.42 8,367.20 6,492.80
> max IOPS 7,856.92 9,244.40 7,184.80
> avg IOPS 7,676.21 8,991.67 6,904.67
> cpu % usr 10.17 7.98 3.68
> cpu % sys 37.55 34.41 23.07
> bw MB/s 31.44 28.28 36.84
>
> write / 8 job
> v6.15 v6.16 v6.16 + this commit
> min IOPS 86,304.60 94,288.80 78,433.60
> max IOPS 105,670.80 110,373.60 96,330.80
> avg IOPS 97,418.81 103,789.76 88,468.27
> cpu % usr 4.98 3.27 3.67
> cpu % sys 21.45 30.85 20.08
> bw MB/s 399.00 362.40 425.00
>
> Assisted analysis gives:
>
> IOPS (Input/Output Operations Per Second):
> The v6.16 kernel shows a slight increase in average IOPS compared to v6.15 (43245.69 vs. 42144.88).
> The v6.16+fix kernel significantly reduces average IOPS, dropping to 36946.17.
>
> Bandwidth (MB/s):
> The v6.16 kernel shows an increase in average bandwidth compared to v6.15 (180.72 MB/s vs. 172.59 MB/s).
> The v6.16 with this commit significantly reduces average bandwidth, dropping to 151.32 MB/s.
>
> Detailed Analysis:
> Impact of v6.16 Kernel:
> The v6.16 kernel introduces a minor improvement in IO performance compared to v6.15.
> Both average IOPS and average bandwidth saw a small increase. This suggests that the v6.16
> kernel might have introduced some optimizations that slightly improved overall IO performance.
>
> Impact of the Fix:
> The introduced fix appears to have a negative impact on both IOPS and bandwidth.
> Both metrics show a substantial decrease compared to both v6.15 and v6.16.
> This indicates that the fix might be detrimental to IO performance.
>
> The threaded IRQ change did increase IOPS and Bandwidth, and stopped starving interrupts.
> This change gives worse numbers than before the threaded IRQ.
Thanks Neil for your numbers.
So there must be more to it... I was originally interested in overall
performance. Re-running against the raw block device, with all kernel debug
options disabled (the adjusted invocation is further below), the absolute
numbers of course change for me, but the general trend is still the same.
fio results for 4k block size on Pixel 6, all values being the average
of 5 runs each:
read / 1 job      original     after                  this commit
min IOPS          7,741.60     6,500.00 (-16.04%)     9,175.60 ( 18.52%)
max IOPS         11,548.80     8,217.60 (-28.84%)    11,476.80 (- 0.62%)
avg IOPS         10,356.69     7,143.21 (-31.03%)    11,098.65 (  7.16%)
cpu % usr             4.31         4.93 ( 14.33%)         3.34 (-22.63%)
cpu % sys            22.12        25.08 ( 13.40%)        18.14 (-17.97%)
bw MB/s              40.46        27.92 (-30.99%)        43.34 (  7.12%)

read / 8 jobs     original     after                  this commit
min IOPS         53,834.00    49,145.20 (- 8.71%)    52,202.40 (- 3.03%)
max IOPS         62,489.20    55,378.00 (-11.38%)    61,207.20 (- 2.05%)
avg IOPS         60,733.97    52,305.85 (-13.88%)    58,617.45 (- 3.48%)
cpu % usr             5.59         4.24 (-24.22%)         5.31 (- 4.94%)
cpu % sys            28.32        21.56 (-23.85%)        27.44 (- 3.08%)
bw MB/s             237.40       204.40 (-13.90%)       228.80 (- 3.62%)

write / 1 job     original     after                  this commit
min IOPS         11,438.00    10,173.60 (-11.05%)    16,418.40 ( 43.54%)
max IOPS         19,752.00    13,366.80 (-32.33%)    19,666.00 (- 0.44%)
avg IOPS         18,329.57    11,656.83 (-36.40%)    18,685.83 (  1.94%)
cpu % usr             6.46         6.30 (- 2.60%)         5.73 (-11.29%)
cpu % sys            33.74        31.34 (- 7.11%)        30.83 (- 8.63%)
bw MB/s              71.60        45.52 (-36.42%)        73.72 (  2.96%)

write / 8 jobs    original     after                  this commit
min IOPS         68,824.20    59,397.20 (-13.70%)    60,699.60 (-11.80%)
max IOPS         89,382.00    68,318.60 (-23.57%)    90,247.00 (  0.97%)
avg IOPS         79,589.76    65,048.45 (-18.27%)    75,100.38 (- 5.64%)
cpu % usr             6.23         5.76 (- 7.48%)         5.49 (-11.89%)
cpu % sys            31.76        27.99 (-11.85%)        28.29 (-10.93%)
bw MB/s             311.00       253.80 (-18.39%)       293.20 (- 5.72%)
'Original' is next-20250708 with the culprit commit reverted, 'after'
is unmodified linux-next.
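For reference, the block device runs here used essentially the same loop as
quoted above, just pointed at the raw device rather than a file on a
filesystem, roughly along these lines. The device node is only an example
(look it up e.g. via readlink as Bart suggested), and note that the write
runs destroy whatever is stored on that device:
dev=/dev/block/sda   # example only; find yours e.g. with readlink /dev/block/by-name/...
for rw in read write ; do
    for jobs in 1 8 ; do
        for it in $(seq 1 5) ; do
            fio --name=rand${rw} --rw=rand${rw} \
                --ioengine=libaio --direct=1 \
                --bs=4k --numjobs=${jobs} --size=32m \
                --runtime=30 --time_based --end_fsync=1 \
                --group_reporting --filename=${dev} \
                | grep -E '(iops|sys=|READ:|WRITE:)'
            sleep 5
        done
    done
done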
While I have in the meantime changed 'this commit' to use ktime rather than
jiffies for the time limit calculation, we can still see the huge effect of
the two changes.
Bandwidth and avg IOPS are down between 13% and 36% with the culprit commit.
Cheers,
Andre'