Message-ID: <0bb28ed3-8b4f-77f3-5648-adb42604f37e@acm.org>
Date: Thu, 4 May 2023 11:04:58 -0700
From: Bart Van Assche <bvanassche@....org>
To: "Bao D. Nguyen" <quic_nguyenb@...cinc.com>,
quic_asutoshd@...cinc.com, quic_cang@...cinc.com, mani@...nel.org,
Powen.Kao@...iatek.com, stanley.chu@...iatek.com,
adrian.hunter@...el.com, beanhuo@...ron.com, avri.altman@....com,
martin.petersen@...cle.com
Cc: linux-scsi@...r.kernel.org, Alim Akhtar <alim.akhtar@...sung.com>,
"James E.J. Bottomley" <jejb@...ux.ibm.com>,
open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 5/5] ufs: core: Add error handling for MCQ mode
On 5/3/23 21:18, Bao D. Nguyen wrote:
> On 4/25/2023 5:21 PM, Bart Van Assche wrote:
>> On 4/17/23 14:05, Bao D. Nguyen wrote:
>>> + /* MCQ mode */
>>> + if (is_mcq_enabled(hba))
>>> + return ufshcd_clear_cmds(hba, 1UL << lrbp->task_tag);
>>
>> The above code will trigger an overflow if lrbp->task_tag >= 8 * sizeof(unsigned long). That's not acceptable.
> This ufshcd_clear_cmds() uses a bit map. There are multiple places in the UFS code that have this limitation if the queue depth grows beyond 64. I am thinking:
> 1. Current UFS controllers in the market probably support a queue depth of 64 or less, so it may not be a problem today if the host controller cap is set to a queue depth of 64, but it can become a problem in multiple places in the code later.
> 2. In MCQ mode, we can pass a tag number into this API, ufshcd_clear_cmds(), while in SDB mode we pass the tag's bit mask as before.
> 3. Use sbitmask() to support a large queue depth? Thanks for any suggestions.
The UFS driver is the only block driver I know of that tracks which commands
are pending in a bitmap. Please pass the lrbp pointer or the task_tag directly
to ufshcd_clear_cmds() instead of passing a bitmap to that function. Please
also introduce a loop around the ufshcd_clear_cmds() call in
ufshcd_eh_device_reset_handler() instead of building a bitmap there.
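
A rough, untested sketch of what I have in mind (the single-tag helper
ufshcd_clear_cmd() and the loop details are my assumptions, not code from
this series):

	/* Hypothetical single-tag variant of ufshcd_clear_cmds(). */
	static int ufshcd_clear_cmd(struct ufs_hba *hba, u32 task_tag);

	/* Abort path: pass the tag itself, no 1UL << task_tag shift. */
	err = ufshcd_clear_cmd(hba, lrbp->task_tag);

	/* ufshcd_eh_device_reset_handler(): loop instead of a bitmap. */
	for (pos = 0; pos < hba->nutrs; pos++) {
		if (hba->lrb[pos].lun != lrbp->lun)
			continue;
		err = ufshcd_clear_cmd(hba, pos);
		if (err)
			break;
	}

That keeps the bitmap (and its 64-tag limit) out of the interface entirely.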
>>> static irqreturn_t ufshcd_transfer_req_compl(struct ufs_hba *hba)
>>> {
>>> + struct ufshcd_lrb *lrbp;
>>> + u32 hwq_num, utag;
>>> + int tag;
>>> +
>>> /* Resetting interrupt aggregation counters first and reading the
>>> * DOOR_BELL afterward allows us to handle all the completed requests.
>>> * In order to prevent other interrupts starvation the DB is read once
>>> @@ -5580,7 +5590,22 @@ static irqreturn_t ufshcd_transfer_req_compl(struct ufs_hba *hba)
>>> * Ignore the ufshcd_poll() return value and return IRQ_HANDLED since we
>>> * do not want polling to trigger spurious interrupt complaints.
>>> */
>>> - ufshcd_poll(hba->host, UFSHCD_POLL_FROM_INTERRUPT_CONTEXT);
>>> + if (!is_mcq_enabled(hba)) {
>>> + ufshcd_poll(hba->host, UFSHCD_POLL_FROM_INTERRUPT_CONTEXT);
>>> + goto out;
>>> + }
>>> +
>>> + /* MCQ mode */
>>> + for (tag = 0; tag < hba->nutrs; tag++) {
>>> + lrbp = &hba->lrb[tag];
>>> + if (lrbp->cmd) {
>>> + utag = blk_mq_unique_tag(scsi_cmd_to_rq(lrbp->cmd));
>>> + hwq_num = blk_mq_unique_tag_to_hwq(utag);
>>> + ufshcd_poll(hba->host, hwq_num);
>>> + }
>>> + }
>>
>> Is my understanding correct that ufshcd_transfer_req_compl() is only called from single doorbell code paths and hence that the above change is not necessary?
> ufshcd_transfer_req_compl() can be invoked in MCQ mode, for example from ufshcd_err_handler(), as shown below:
> ufshcd_err_handler()-->ufshcd_complete_requests()-->ufshcd_transfer_req_compl()
Since there are multiple statements in ufshcd_transfer_req_compl() that assume
SDB mode (resetting SDB interrupt aggregation and calling ufshcd_poll()), please
move the is_mcq_enabled() test from ufshcd_transfer_req_compl() into the
callers of that function.
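
For example (again an untested sketch; ufshcd_mcq_poll_cqe_lock(), hba->uhq
and hba->nr_hw_queues are taken from my reading of the MCQ code and may not
match this series exactly):

	/* In ufshcd_complete_requests() or another caller: */
	if (is_mcq_enabled(hba)) {
		unsigned int i;

		/* Complete per hardware queue instead of walking hba->lrb[]. */
		for (i = 0; i < hba->nr_hw_queues; i++)
			ufshcd_mcq_poll_cqe_lock(hba, &hba->uhq[i]);
	} else {
		ufshcd_transfer_req_compl(hba);
	}

That way ufshcd_transfer_req_compl() remains SDB-only and does not need any
MCQ-specific code.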
Thanks,
Bart.