Message-ID: <76af97e49cb7f36c8dc6edc62c84e72d6bb4669c.camel@linaro.org>
Date: Mon, 28 Jul 2025 15:49:01 +0100
From: André Draszik <andre.draszik@...aro.org>
To: Neil Armstrong <neil.armstrong@...aro.org>, Alim Akhtar
<alim.akhtar@...sung.com>, Avri Altman <avri.altman@....com>, Bart Van
Assche <bvanassche@....org>, "James E.J. Bottomley"
<James.Bottomley@...senPartnership.com>, "Martin K. Petersen"
<martin.petersen@...cle.com>
Cc: Peter Griffin <peter.griffin@...aro.org>, Tudor Ambarus
<tudor.ambarus@...aro.org>, Will McVicker <willmcvicker@...gle.com>,
Manivannan Sadhasivam <mani@...nel.org>, kernel-team@...roid.com,
linux-arm-msm@...r.kernel.org, linux-samsung-soc@...r.kernel.org,
linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
stable@...r.kernel.org
Subject: Re: [PATCH v2 2/2] scsi: ufs: core: move some irq handling back to
hardirq (with time limit)
On Mon, 2025-07-28 at 16:43 +0200, Neil Armstrong wrote:
> On 25/07/2025 16:16, André Draszik wrote:
> > Commit 3c7ac40d7322 ("scsi: ufs: core: Delegate the interrupt service
> > routine to a threaded IRQ handler") introduced a massive performance
> > drop for various workloads on UFSHC versions < 4 due to the extra
> > latency introduced by moving all of the IRQ handling into a threaded
> > handler. See below for a summary.
> >
> > To resolve this performance drop, move IRQ handling back into hardirq
> > context, but apply a time limit which, once expired, will cause the
> > remainder of the work to be deferred to the threaded handler.
> >
> > The above commit tries to avoid unduly delaying other subsystems'
> > interrupts while the UFS events are being handled. By limiting the
> > amount of time spent in hardirq context, we can still ensure that.
> >
> > The time limit itself was chosen because I have generally seen
> > interrupt handling complete within 20 usecs, with occasional
> > spikes of a couple hundred usecs.
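For illustration, a minimal sketch of the pattern described above: handle
events in hardirq context until a time budget expires, then return
IRQ_WAKE_THREAD so the remainder is deferred to the threaded handler
registered via request_threaded_irq(). This is not the actual patch;
ufs_sketch_host, pending_events() and handle_events() are hypothetical
stand-ins for the driver's real state and helpers:

    /*
     * Sketch only: process events in hardirq context until the time
     * budget expires, then wake the threaded handler for the rest.
     * Assumes registration via request_threaded_irq().
     */
    #include <linux/interrupt.h>
    #include <linux/ktime.h>

    #define UFS_IRQ_BUDGET_US 20    /* assumed budget, per the figures above */

    struct ufs_sketch_host;         /* driver state, opaque for this sketch */
    u32 pending_events(struct ufs_sketch_host *host);
    void handle_events(struct ufs_sketch_host *host, u32 events);

    static irqreturn_t ufs_sketch_intr(int irq, void *dev_id)
    {
            struct ufs_sketch_host *host = dev_id;
            const ktime_t deadline = ktime_add_us(ktime_get(),
                                                  UFS_IRQ_BUDGET_US);
            u32 events;

            while ((events = pending_events(host))) {
                    handle_events(host, events);

                    /* Budget exhausted: defer remaining work to the thread. */
                    if (ktime_after(ktime_get(), deadline))
                            return IRQ_WAKE_THREAD;
            }

            return IRQ_HANDLED;
    }

With request_threaded_irq(), returning IRQ_WAKE_THREAD from the hardirq
handler is what hands the remaining work to the threaded handler.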
> >
> > This commit brings UFS performance roughly back to its original
> > level, and should still avoid starving other subsystems, since the
> > occasional spikes are deferred to the threaded handler.
> >
> > fio results for 4k block size on Pixel 6, all values being the
> > average of 5 runs:
> > read / 1 job     original      after  this commit
> > min IOPS         4,653.60   2,704.40     3,902.80
> > max IOPS         6,151.80   4,847.60     6,103.40
> > avg IOPS         5,488.82   4,226.61     5,314.89
> > cpu % usr            1.85       1.72         1.97
> > cpu % sys           32.46      28.88        33.29
> > bw MB/s             21.46      16.50        20.76
> >
> > read / 8 jobs    original      after  this commit
> > min IOPS        18,207.80  11,323.00    17,911.80
> > max IOPS        25,535.80  14,477.40    24,373.60
> > avg IOPS        22,529.93  13,325.59    21,868.85
> > cpu % usr            1.70       1.41         1.67
> > cpu % sys           27.89      21.85        27.23
> > bw MB/s             88.10      52.10        84.48
> >
> > write / 1 job    original      after  this commit
> > min IOPS         6,524.20   3,136.00     5,988.40
> > max IOPS         7,303.60   5,144.40     7,232.40
> > avg IOPS         7,169.80   4,608.29     7,014.66
> > cpu % usr            2.29       2.34         2.23
> > cpu % sys           41.91      39.34        42.48
> > bw MB/s             28.02      18.00        27.42
> >
> > write / 8 jobs   original      after  this commit
> > min IOPS        12,685.40  13,783.00    12,622.40
> > max IOPS        30,814.20  22,122.00    29,636.00
> > avg IOPS        21,539.04  18,552.63    21,134.65
> > cpu % usr            2.08       1.61         2.07
> > cpu % sys           30.86      23.88        30.64
> > bw MB/s             84.18      72.54        82.62
>
> Thanks for this updated change. I'm running the exact same test on
> SM8650 to check the impact, and will report back with comparable figures.
Btw, my complete command was as follows (I should probably have added
it to the commit message in the first place):
for rw in read write ; do
    echo "rw: ${rw}"
    for jobs in 1 8 ; do
        echo "jobs: ${jobs}"
        for it in $(seq 1 5) ; do
            fio --name=rand${rw} --rw=rand${rw} \
                --ioengine=libaio --direct=1 \
                --bs=4k --numjobs=${jobs} --size=32m \
                --runtime=30 --time_based --end_fsync=1 \
                --group_reporting --filename=/foo \
                | grep -E '(iops|sys=|READ:|WRITE:)'
            sleep 5
        done
    done
done
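(Note: --direct=1 makes fio use O_DIRECT, bypassing the page cache so the
runs exercise the UFS device directly, and --filename should point at a
file on the filesystem backed by the device under test.)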
Cheers,
Andre'