linux-kernel - Re: [PATCH RFT v3 3/3] ufs: core: delegate the interrupt service routine to a threaded irq handler

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <0f480187-f0b8-482f-8b43-ed9bd454ec5b@quicinc.com>
Date: Tue, 29 Jul 2025 04:41:15 +0530
From: Nitin Rawat <quic_nitirawa@...cinc.com>
To: Neil Armstrong <neil.armstrong@...aro.org>,
        Alim Akhtar
	<alim.akhtar@...sung.com>,
        Avri Altman <avri.altman@....com>,
        Bart Van Assche
	<bvanassche@....org>,
        "James E.J. Bottomley"
	<James.Bottomley@...senPartnership.com>,
        "Martin K. Petersen"
	<martin.petersen@...cle.com>
CC: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>,
        <linux-arm-msm@...r.kernel.org>, <linux-scsi@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFT v3 3/3] ufs: core: delegate the interrupt service
 routine to a threaded irq handler



On 4/7/2025 3:47 PM, Neil Armstrong wrote:
> On systems with a large number request slots and unavailable MCQ ESI,
> the current design of the interrupt handler can delay handling of
> other subsystems interrupts causing display artifacts, GPU stalls
> or system firmware requests timeouts.
> 
> Since the interrupt routine can take quite some time, it's
> preferable to move it to a threaded handler and leave the
> hard interrupt handler wake up the threaded interrupt routine,
> the interrupt line would be masked until the processing is
> finished in the thread thanks to the IRQS_ONESHOT flag.
> 
> When MCQ & ESI interrupts are enabled the I/O completions are now
> directly handled in the "hard" interrupt routine to keep IOPs high
> since queues handling is done in separate per-queue interrupt routines.
> 
> This fixes all encountered issued when running FIO tests
> on the Qualcomm SM8650 platform.

Hi Neil,

I tested this change on both SM8750 and SM8650 using the upstream kernel 
with MCQ enabled locally. I also validated it on the SM8750 downstream 
codebase. In all cases, enabling MCQ mode led to boot-up issues on these 
targets.

The root cause was that in MCQ mode, the Interrupt Status (IS) register 
was not being cleared in the ufshcd_intr function. This resulted in 
abnormal behavior during subsequent UIC commands, ultimately causing 
boot failures.

To address this issue, I’ve submitted the following patch: [PATCH V1] 
ufs: core: Fix interrupt handling for MCQ Mode in ufshcd_intr.

I also have plan to get the performance number with SDB and MCQ mode 
with these change. I'll update the thread once i get the number.

Thanks,
Nitin

> 
> Example of errors reported on a loaded system:
>   [drm:dpu_encoder_frame_done_timeout:2706] [dpu error]enc32 frame done timeout
>   msm_dpu ae01000.display-controller: [drm:hangcheck_handler [msm]] *ERROR* 67.5.20.1: hangcheck detected gpu lockup rb 2!
>   msm_dpu ae01000.display-controller: [drm:hangcheck_handler [msm]] *ERROR* 67.5.20.1:     completed fence: 74285
>   msm_dpu ae01000.display-controller: [drm:hangcheck_handler [msm]] *ERROR* 67.5.20.1:     submitted fence: 74286
>   Error sending AMC RPMH requests (-110)
> 
> Signed-off-by: Neil Armstrong <neil.armstrong@...aro.org>
> ---
>   drivers/ufs/core/ufshcd.c | 30 +++++++++++++++++++++++++++---
>   1 file changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 7f256f77b8ba9853569157db7785d177b6cd6dee..b40660ca2fa6b3488645bd26121752554a8d6a08 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -6971,7 +6971,7 @@ static irqreturn_t ufshcd_sl_intr(struct ufs_hba *hba, u32 intr_status)
>   }
>   
>   /**
> - * ufshcd_intr - Main interrupt service routine
> + * ufshcd_threaded_intr - Threaded interrupt service routine
>    * @irq: irq number
>    * @__hba: pointer to adapter instance
>    *
> @@ -6979,7 +6979,7 @@ static irqreturn_t ufshcd_sl_intr(struct ufs_hba *hba, u32 intr_status)
>    *  IRQ_HANDLED - If interrupt is valid
>    *  IRQ_NONE    - If invalid interrupt
>    */
> -static irqreturn_t ufshcd_intr(int irq, void *__hba)
> +static irqreturn_t ufshcd_threaded_intr(int irq, void *__hba)
>   {
>   	u32 last_intr_status, intr_status, enabled_intr_status = 0;
>   	irqreturn_t retval = IRQ_NONE;
> @@ -7018,6 +7018,29 @@ static irqreturn_t ufshcd_intr(int irq, void *__hba)
>   	return retval;
>   }
>   
> +/**
> + * ufshcd_intr - Main interrupt service routine
> + * @irq: irq number
> + * @__hba: pointer to adapter instance
> + *
> + * Return:
> + *  IRQ_HANDLED     - If interrupt is valid
> + *  IRQ_WAKE_THREAD - If handling is moved to threaded handled
> + *  IRQ_NONE        - If invalid interrupt
> + */
> +static irqreturn_t ufshcd_intr(int irq, void *__hba)
> +{
> +	struct ufs_hba *hba = __hba;
> +
> +	/* Move interrupt handling to thread when MCQ & ESI are not enabled */
> +	if (!hba->mcq_enabled || !hba->mcq_esi_enabled)
> +		return IRQ_WAKE_THREAD;
> +
> +	/* Directly handle interrupts since MCQ ESI handlers does the hard job */
> +	return ufshcd_sl_intr(hba, ufshcd_readl(hba, REG_INTERRUPT_STATUS) &
> +				   ufshcd_readl(hba, REG_INTERRUPT_ENABLE));
> +}
> +
>   static int ufshcd_clear_tm_cmd(struct ufs_hba *hba, int tag)
>   {
>   	int err = 0;
> @@ -10577,7 +10600,8 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq)
>   	ufshcd_readl(hba, REG_INTERRUPT_ENABLE);
>   
>   	/* IRQ registration */
> -	err = devm_request_irq(dev, irq, ufshcd_intr, IRQF_SHARED, UFSHCD, hba);
> +	err = devm_request_threaded_irq(dev, irq, ufshcd_intr, ufshcd_threaded_intr,
> +					IRQF_ONESHOT | IRQF_SHARED, UFSHCD, hba);
>   	if (err) {
>   		dev_err(hba->dev, "request irq failed\n");
>   		goto out_disable;
>