[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b548158d-a155-bde9-caff-0f3fefbda403@acm.org>
Date: Thu, 7 Oct 2021 14:47:11 -0700
From: Bart Van Assche <bvanassche@....org>
To: Kiwoong Kim <kwmad.kim@...sung.com>, linux-scsi@...r.kernel.org,
linux-kernel@...r.kernel.org, alim.akhtar@...sung.com,
avri.altman@....com, jejb@...ux.ibm.com,
martin.petersen@...cle.com, beanhuo@...ron.com,
cang@...eaurora.org, adrian.hunter@...el.com, sc.suh@...sung.com,
hy50.seo@...sung.com, sh425.lee@...sung.com,
bhoon95.kim@...sung.com, vkumar.1997@...sung.com
Subject: Re: [PATCH v1] scsi: ufs: clear doorbell for hibern8 errors when
using ah8
On 10/7/21 12:40 AM, Kiwoong Kim wrote:
> When an scsi command is dispatched right after host complete
> all the pending requests goes to idle and ufs driver tries
> to ring a doorbell, host might be still during entering into
> hibern8. If an error occurrs during that period, the doorbell
> might not be zero. In this case, clearing it should be needed
> to requeue its command.
> Currently, ufshcd_err_handler goes directly to reset when the
> driver's link state is broken. This patch is to make it clear
> doorbells in the case. In the situation, communication would
> be disabled, so TM isn't necessary or we can say it's not
> available.
The above text is too hard to comprehend. Please make it more clear. As
an example, in the first sentence, what does "goes to idle" apply to?
Does it apply to "SCSI command", "pending requests" or something else?
What mechanism does "If an error occurs" refer to? A SCSI command
processing error, a link error or another type of error?
> Here's an actual symptom that I've faced. At the time, tag #17
> is still pended even after host reset. And then the block timer
> is expired.
pended -> pending.
> exynos-ufs 11100000.ufs: ufshcd_check_errors: Auto Hibern8
> Enter failed - status: 0x00000040, upmcrs: 0x00000001
> ..
> host_regs: 00000050: b8671000 00000008 00020000 00000000
> ..
> exynos-ufs 11100000.ufs: ufshcd_abort: Device abort task at tag 17
The relationship between the example and the description above the
example is not clear. If a command is pending before the error handler
starts, aborting that command fails and the host is not reset then the
command will still be pending after the error handler has finished.
> @@ -6138,7 +6139,12 @@ static void ufshcd_err_handler(struct Scsi_Host *host)
> (hba->saved_uic_err & (UFSHCD_UIC_DL_NAC_RECEIVED_ERROR |
> UFSHCD_UIC_DL_TCx_REPLAY_ERROR)))) {
> needs_reset = true;
> - goto do_reset;
> + spin_lock_irqsave(hba->host->host_lock, flags);
> + if (!hba->ahit && ufshcd_is_link_broken(hba))
> + link_broken_in_ah8 = true;
> + spin_unlock_irqrestore(hba->host->host_lock, flags);
> + if (!link_broken_in_ah8)
> + goto do_reset;
> }
>
My understanding is that hba->ahit == 0 means that auto-hibernation is
disabled. Hence, the above code sets 'link_broken_in_ah8' only if
auto-hibernation is disabled. So what does the name of the variable
'link_broken_in_ah8' mean?
> @@ -6168,6 +6174,9 @@ static void ufshcd_err_handler(struct Scsi_Host *host)
> }
> }
>
> + if (link_broken_in_ah8)
> + goto lock_skip_pending_xfer_clear;
> +
> /* Clear pending task management requests */
> for_each_set_bit(tag, &hba->outstanding_tasks, hba->nutmrs) {
> if (ufshcd_clear_tm_cmd(hba, tag)) {
Why is skipping the ufshcd_clear_tm_cmd() calls useful in this case?
Thanks,
Bart.
Powered by blists - more mailing lists