[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <0f480187-f0b8-482f-8b43-ed9bd454ec5b@quicinc.com>
Date: Tue, 29 Jul 2025 04:41:15 +0530
From: Nitin Rawat <quic_nitirawa@...cinc.com>
To: Neil Armstrong <neil.armstrong@...aro.org>,
Alim Akhtar
<alim.akhtar@...sung.com>,
Avri Altman <avri.altman@....com>,
Bart Van Assche
<bvanassche@....org>,
"James E.J. Bottomley"
<James.Bottomley@...senPartnership.com>,
"Martin K. Petersen"
<martin.petersen@...cle.com>
CC: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>,
<linux-arm-msm@...r.kernel.org>, <linux-scsi@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFT v3 3/3] ufs: core: delegate the interrupt service
routine to a threaded irq handler
On 4/7/2025 3:47 PM, Neil Armstrong wrote:
> On systems with a large number request slots and unavailable MCQ ESI,
> the current design of the interrupt handler can delay handling of
> other subsystems interrupts causing display artifacts, GPU stalls
> or system firmware requests timeouts.
>
> Since the interrupt routine can take quite some time, it's
> preferable to move it to a threaded handler and leave the
> hard interrupt handler wake up the threaded interrupt routine,
> the interrupt line would be masked until the processing is
> finished in the thread thanks to the IRQS_ONESHOT flag.
>
> When MCQ & ESI interrupts are enabled the I/O completions are now
> directly handled in the "hard" interrupt routine to keep IOPs high
> since queues handling is done in separate per-queue interrupt routines.
>
> This fixes all encountered issued when running FIO tests
> on the Qualcomm SM8650 platform.
Hi Neil,
I tested this change on both SM8750 and SM8650 using the upstream kernel
with MCQ enabled locally. I also validated it on the SM8750 downstream
codebase. In all cases, enabling MCQ mode led to boot-up issues on these
targets.
The root cause was that in MCQ mode, the Interrupt Status (IS) register
was not being cleared in the ufshcd_intr function. This resulted in
abnormal behavior during subsequent UIC commands, ultimately causing
boot failures.
To address this issue, I’ve submitted the following patch: [PATCH V1]
ufs: core: Fix interrupt handling for MCQ Mode in ufshcd_intr.
I also have plan to get the performance number with SDB and MCQ mode
with these change. I'll update the thread once i get the number.
Thanks,
Nitin
>
> Example of errors reported on a loaded system:
> [drm:dpu_encoder_frame_done_timeout:2706] [dpu error]enc32 frame done timeout
> msm_dpu ae01000.display-controller: [drm:hangcheck_handler [msm]] *ERROR* 67.5.20.1: hangcheck detected gpu lockup rb 2!
> msm_dpu ae01000.display-controller: [drm:hangcheck_handler [msm]] *ERROR* 67.5.20.1: completed fence: 74285
> msm_dpu ae01000.display-controller: [drm:hangcheck_handler [msm]] *ERROR* 67.5.20.1: submitted fence: 74286
> Error sending AMC RPMH requests (-110)
>
> Signed-off-by: Neil Armstrong <neil.armstrong@...aro.org>
> ---
> drivers/ufs/core/ufshcd.c | 30 +++++++++++++++++++++++++++---
> 1 file changed, 27 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 7f256f77b8ba9853569157db7785d177b6cd6dee..b40660ca2fa6b3488645bd26121752554a8d6a08 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -6971,7 +6971,7 @@ static irqreturn_t ufshcd_sl_intr(struct ufs_hba *hba, u32 intr_status)
> }
>
> /**
> - * ufshcd_intr - Main interrupt service routine
> + * ufshcd_threaded_intr - Threaded interrupt service routine
> * @irq: irq number
> * @__hba: pointer to adapter instance
> *
> @@ -6979,7 +6979,7 @@ static irqreturn_t ufshcd_sl_intr(struct ufs_hba *hba, u32 intr_status)
> * IRQ_HANDLED - If interrupt is valid
> * IRQ_NONE - If invalid interrupt
> */
> -static irqreturn_t ufshcd_intr(int irq, void *__hba)
> +static irqreturn_t ufshcd_threaded_intr(int irq, void *__hba)
> {
> u32 last_intr_status, intr_status, enabled_intr_status = 0;
> irqreturn_t retval = IRQ_NONE;
> @@ -7018,6 +7018,29 @@ static irqreturn_t ufshcd_intr(int irq, void *__hba)
> return retval;
> }
>
> +/**
> + * ufshcd_intr - Main interrupt service routine
> + * @irq: irq number
> + * @__hba: pointer to adapter instance
> + *
> + * Return:
> + * IRQ_HANDLED - If interrupt is valid
> + * IRQ_WAKE_THREAD - If handling is moved to threaded handled
> + * IRQ_NONE - If invalid interrupt
> + */
> +static irqreturn_t ufshcd_intr(int irq, void *__hba)
> +{
> + struct ufs_hba *hba = __hba;
> +
> + /* Move interrupt handling to thread when MCQ & ESI are not enabled */
> + if (!hba->mcq_enabled || !hba->mcq_esi_enabled)
> + return IRQ_WAKE_THREAD;
> +
> + /* Directly handle interrupts since MCQ ESI handlers does the hard job */
> + return ufshcd_sl_intr(hba, ufshcd_readl(hba, REG_INTERRUPT_STATUS) &
> + ufshcd_readl(hba, REG_INTERRUPT_ENABLE));
> +}
> +
> static int ufshcd_clear_tm_cmd(struct ufs_hba *hba, int tag)
> {
> int err = 0;
> @@ -10577,7 +10600,8 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq)
> ufshcd_readl(hba, REG_INTERRUPT_ENABLE);
>
> /* IRQ registration */
> - err = devm_request_irq(dev, irq, ufshcd_intr, IRQF_SHARED, UFSHCD, hba);
> + err = devm_request_threaded_irq(dev, irq, ufshcd_intr, ufshcd_threaded_intr,
> + IRQF_ONESHOT | IRQF_SHARED, UFSHCD, hba);
> if (err) {
> dev_err(hba->dev, "request irq failed\n");
> goto out_disable;
>
Powered by blists - more mailing lists