Message-ID: <690bbcfe-d6db-f6d1-acea-8ee5aa4ac606@quicinc.com>
Date: Wed, 8 Mar 2023 17:35:54 -0800
From: "Bao D. Nguyen" <quic_nguyenb@...cinc.com>
To: Bart Van Assche <bvanassche@....org>, <quic_asutoshd@...cinc.com>,
<quic_cang@...cinc.com>, <mani@...nel.org>,
<stanley.chu@...iatek.com>, <adrian.hunter@...el.com>,
<beanhuo@...ron.com>, <avri.altman@....com>,
<martin.petersen@...cle.com>
CC: <linux-scsi@...r.kernel.org>,
Alim Akhtar <alim.akhtar@...sung.com>,
"James E.J. Bottomley" <jejb@...ux.ibm.com>,
Arthur Simchaev <Arthur.Simchaev@....com>,
open list <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH v1 4/4] ufs: mcq: Added ufshcd_mcq_abort()
On 3/8/2023 3:25 PM, Bart Van Assche wrote:
> On 3/8/23 14:37, Bao D. Nguyen wrote:
>> On 3/8/2023 11:02 AM, Bart Van Assche wrote:
>>> On 3/7/23 20:01, Bao D. Nguyen wrote:
>>>> + if (ufshcd_mcq_cqe_search(hba, hwq, tag)) {
>>>> + dev_err(hba->dev, "%s: cmd found in cq. hwq=%d, tag=%d\n",
>>>> + __func__, hwq->id, tag);
>>>> + /*
>>>> + * The command should not be 'stuck' in the CQ for such a long time.
>>>> + * Is interrupt missing? Process the CQEs here. If the interrupt is
>>>> + * invoked at a later time, the CQ will be empty because the CQEs
>>>> + * are already processed here.
>>>> + */
>>>> + ufshcd_mcq_poll_cqe_lock(hba, hwq);
>>>> + err = SUCCESS;
>>>> + goto out;
>>>> + }
>>>
>>> Please remove the above code and also the definition of the
>>> ufshcd_mcq_cqe_search() function. The SCSI error handler submits an
>>> abort to deal with command processing timeouts.
>>> ufshcd_mcq_cqe_search() can only return true in case of a software
>>> bug at the host side. Addressing such bugs is out of scope for the
>>> SCSI error handler.
>>
>> This is an attempt to handle the error case similar to SDB mode where
>> it prints "%s: cmd was completed, but without a notifying intr, tag =
>> %d" in the ufshcd_abort() function.
>>
>> In this case the command has been completed by the hardware, but for
>> some reason the software has not processed it. We have seen this print
>> happen during debug sessions, so the error case does happen in SDB
>> mode.
>>
>> Are you suggesting we should return error in this case without
>> calling ufshcd_mcq_poll_cqe_lock()?
>
> What I am asking is to remove ufshcd_mcq_cqe_search() and all code
> that depends on that function returning true. Although such code might
> be useful for SoC debugging, helping with SoC debugging is out of
> scope for Linux kernel drivers.
I will remove it. In that case, we don't need the first patch of this
series, so I will remove the first patch as well. Thanks.
>
> Thanks,
>
> Bart.
>
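
For readers following the thread, the check under discussion (look for the
timed-out tag among unprocessed completion queue entries and, if it is there,
process the pending entries) can be sketched in isolation as below. This is a
toy model only, not the driver code: struct sketch_cq, CQ_DEPTH and the
sketch_cq_*() functions are hypothetical names invented for illustration, and
the real ufshcd_mcq_cqe_search() and ufshcd_mcq_poll_cqe_lock() operate on
driver structures that are not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>

#define CQ_DEPTH 8	/* hypothetical queue depth, chosen for the sketch */

/* Hypothetical, simplified completion queue; not the real hardware queue. */
struct sketch_cq {
	int tags[CQ_DEPTH];	/* tag carried by each completion entry */
	unsigned int head;	/* next entry software will consume */
	unsigned int tail;	/* next entry hardware will fill */
};

/* Model the hardware posting a completion entry for 'tag'. */
void sketch_cq_post(struct sketch_cq *cq, int tag)
{
	unsigned int next = (cq->tail + 1) % CQ_DEPTH;

	assert(next != cq->head);	/* the sketch does not handle a full queue */
	cq->tags[cq->tail] = tag;
	cq->tail = next;
}

/* Return true if an unprocessed entry for 'tag' sits between head and tail. */
bool sketch_cq_search(const struct sketch_cq *cq, int tag)
{
	unsigned int i;

	for (i = cq->head; i != cq->tail; i = (i + 1) % CQ_DEPTH)
		if (cq->tags[i] == tag)
			return true;
	return false;
}

/* Consume every pending entry and return how many were processed. */
unsigned int sketch_cq_poll(struct sketch_cq *cq)
{
	unsigned int n = 0;

	while (cq->head != cq->tail) {
		cq->head = (cq->head + 1) % CQ_DEPTH;
		n++;
	}
	return n;
}
```

In this model, a search that returns true means the hardware has already
completed the command and software has not yet consumed the entry, which is
the situation the review comments above describe as a host-side software bug.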