Message-ID: <1e06161bf49a3a88c4ea2e7a406815be56114c4f.camel@linaro.org>
Date: Mon, 21 Jul 2025 13:04:53 +0100
From: André Draszik <andre.draszik@...aro.org>
To: Neil Armstrong <neil.armstrong@...aro.org>,
 Alim Akhtar <alim.akhtar@...sung.com>,
 Avri Altman <avri.altman@....com>,
 Bart Van Assche <bvanassche@....org>,
 "James E.J. Bottomley" <James.Bottomley@...senPartnership.com>,
 "Martin K. Petersen" <martin.petersen@...cle.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>,
 linux-arm-msm@...r.kernel.org, linux-scsi@...r.kernel.org,
 linux-kernel@...r.kernel.org, Peter Griffin <peter.griffin@...aro.org>,
 Will McVicker <willmcvicker@...gle.com>, kernel-team@...roid.com,
 Tudor Ambarus <tudor.ambarus@...aro.org>
Subject: Re: [PATCH RFT v3 3/3] ufs: core: delegate the interrupt service routine to a threaded irq handler

Hi,

On Mon, 2025-04-07 at 12:17 +0200, Neil Armstrong wrote:
> On systems with a large number request slots and unavailable MCQ ESI,
> the current design of the interrupt handler can delay handling of
> other subsystems interrupts causing display artifacts, GPU stalls
> or system firmware requests timeouts.
> 
> Since the interrupt routine can take quite some time, it's
> preferable to move it to a threaded handler and leave the
> hard interrupt handler wake up the threaded interrupt routine,
> the interrupt line would be masked until the processing is
> finished in the thread thanks to the IRQS_ONESHOT flag.
> 
> When MCQ & ESI interrupts are enabled the I/O completions are now
> directly handled in the "hard" interrupt routine to keep IOPs high
> since queues handling is done in separate per-queue interrupt routines.

This patch adversely affects UFS performance on Pixel 6, and
therefore probably on all devices with a controller older than v4.
I believe Pixel 6 has a UFSHCI v3.x controller and, if my limited
understanding is correct, MCQ & ESI are features of v4 controllers
only.

On Pixel 6, fio reports the following performance on linux-next
with this patch applied:

read [1] / write [2]:
   READ: bw=17.1MiB/s (17.9MB/s), 17.1MiB/s-17.1MiB/s (17.9MB/s-17.9MB/s), io=684MiB (718MB), run=40001-40001msec
  WRITE: bw=20.6MiB/s (21.5MB/s), 20.6MiB/s-20.6MiB/s (21.5MB/s-21.5MB/s), io=822MiB (862MB), run=40003-40003msec

With this patch reverted, performance returns to:

read [1] / write [2]:
   READ: bw=19.9MiB/s (20.8MB/s), 19.9MiB/s-19.9MiB/s (20.8MB/s-20.8MB/s), io=795MiB (833MB), run=40001-40001msec
  WRITE: bw=28.0MiB/s (29.4MB/s), 28.0MiB/s-28.0MiB/s (29.4MB/s-29.4MB/s), io=1122MiB (1176MB), run=40003-40003msec

These numbers are consistent over multiple runs, so the patch causes
a ~26% reduction for write and a ~14% reduction for read.
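
For reference, the percentages follow from the fio bandwidth figures above as (reverted - patched) / reverted. A minimal sketch of that arithmetic, where the helper name `reduction_pct` is made up for illustration:

```c
#include <assert.h>

/*
 * Relative drop, in percent, when going from `before` to `after`.
 * Hypothetical helper, only used to reproduce the numbers quoted above.
 */
static double reduction_pct(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}
```

Calling `reduction_pct(28.0, 20.6)` gives the write figure and `reduction_pct(19.9, 17.1)` the read figure, both taken from the MiB/s values reported by fio.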

PCBenchmark even reports performance drops of ~41%.


I don't know much about UFS at this stage, but could the code simply
check for the controller version and revert to original behaviour
if < v4? Any thoughts on such a change?
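
To make the question concrete, here is a minimal, self-contained sketch of such a check. It is a model only, not actual ufshcd code: the struct, the enum, the function names and the version encoding (major in bits 15:8, minor in bits 7:4) are all assumptions made for illustration, and whether the controller version is the right criterion at all is exactly what is being asked.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding for this sketch: major in bits 15:8, minor in bits 7:4. */
static uint32_t sketch_ufshci_version(uint32_t major, uint32_t minor)
{
	return (major << 8) + (minor << 4);
}

/* Hypothetical stand-in for the host controller state. */
struct sketch_hba {
	uint32_t ufs_version;	/* encoded controller version */
	bool mcq_esi_enabled;	/* MCQ with ESI interrupts in use */
};

enum sketch_isr_mode {
	SKETCH_ISR_HARD,	/* handle everything in the hard irq handler */
	SKETCH_ISR_THREADED,	/* hard handler only wakes the irq thread */
};

static enum sketch_isr_mode sketch_pick_isr_mode(const struct sketch_hba *hba)
{
	/* Per the patch description, MCQ & ESI completions stay in hard irq. */
	if (hba->mcq_esi_enabled)
		return SKETCH_ISR_HARD;

	/* Proposed: controllers older than v4 keep the original behaviour. */
	if (hba->ufs_version < sketch_ufshci_version(4, 0))
		return SKETCH_ISR_HARD;

	return SKETCH_ISR_THREADED;
}
```

With this model, a v3.x controller such as the one believed to be in Pixel 6 would keep the hard irq path, while a v4 controller without MCQ & ESI would use the threaded handler.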


[1]: fio --name=randread --rw=randread --ioengine=libaio --direct=1 \
         --bs=4k --numjobs=1 --size=1g --ramp_time=10 --runtime=40 --time_based \
         --end_fsync=1 --group_reporting --filename=/foo

[2]: fio --name=randwrite --rw=randwrite --ioengine=libaio --direct=1 \
         --bs=4k --numjobs=1 --size=1g --ramp_time=10 --runtime=40 --time_based \
         --end_fsync=1 --group_reporting --filename=/foo

Cheers,
Andre'
