[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ecf749fc88220910563704ef41939d40@codeaurora.org>
Date: Wed, 30 Jun 2021 05:50:46 +0800
From: Can Guo <cang@...eaurora.org>
To: Bart Van Assche <bvanassche@....org>
Cc: Adrian Hunter <adrian.hunter@...el.com>, asutoshd@...eaurora.org,
nguyenb@...eaurora.org, hongwus@...eaurora.org,
ziqichen@...eaurora.org, linux-scsi@...r.kernel.org,
kernel-team@...roid.com, Alim Akhtar <alim.akhtar@...sung.com>,
Avri Altman <avri.altman@....com>,
"James E.J. Bottomley" <jejb@...ux.ibm.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Stanley Chu <stanley.chu@...iatek.com>,
Bean Huo <beanhuo@...ron.com>,
Jaegeuk Kim <jaegeuk@...nel.org>,
open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 06/10] scsi: ufs: Remove host_sem used in
suspend/resume
On 2021-06-30 02:01, Bart Van Assche wrote:
> On 6/28/21 11:23 PM, Can Guo wrote:
>> On 2021-06-29 01:31, Bart Van Assche wrote:
>>> On 6/28/21 1:17 AM, Can Guo wrote:
>>>> On 2021-06-25 01:11, Bart Van Assche wrote:
>>>>> On 6/23/21 11:31 PM, Can Guo wrote:
>>>>>> Using back host_sem in suspend_prepare()/resume_complete()
>>>>>> won't have this problem of deadlock, right?
>>>>>
>>>>> Although that would solve the deadlock discussed in this email
>>>>> thread, it wouldn't solve the issue of potential adverse
>>>>> interactions of the UFS error handler and the SCSI error
>>>>> handler running concurrently.
>>>>
>>>> I think I've explained it before, paste it here -
>>>>
>>>> ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and
>>>> flushes it, so SCSI error handler and UFS error handler can
>>>> safely run together.
>>>
>>> That code path is the exception. Do you agree that the following
>>> three functions all invoke the ufshcd_err_handler() function
>>> asynchronously? * ufshcd_uic_pwr_ctrl() * ufshcd_check_errors() *
>>> ufshcd_abort()
>>
>> I agree, but I don't see what's wrong with that. Any context can
>> invoke ufs error handler asynchronously and ufs error handler prepare
>> makes sure error handler can work safely, i.e., stopping PM
>> ops/gating/scaling in error handler prepare makes sure no one shall
>> call ufshcd_uic_pwr_ctrl() ever again. And ufshcd_check_errors() and
>> ufshcd_abort() are OK to run concurrently with UFS error handler.
>
> The current UFS error handling approach requires the following code in
> ufshcd_queuecommand():
>
> if (hba->pm_op_in_progress) {
> hba->force_reset = true;
> set_host_byte(cmd, DID_BAD_TARGET);
> cmd->scsi_done(cmd);
> goto out;
> }
>
> Removing that code is not possible with the current error handling
> approach. My patch makes it possible to remove that code.
>
>> Sorry that I missed the change of scsi_transport_template() in your
>> previous message. I can understand that you want to invoke UFS error
>> hander by invoking SCSI error handler, but I didn't go that far
>> because I saw you changed pm_runtime_get_sync() to
>> pm_runtime_get_noresume() in ufs error handler prepare. How can that
>> change make sure that the device is not suspending or resuming while
>> error handler is running?
>
> UFS power state transitions happen by submitting a SCSI command to a
> WLUN. The SCSI error handler is only activated after all outstanding
> SCSI commands for a SCSI host have failed or completed. I think this
> guarantees for the UFS driver that eh_strategy_handler is not invoked
> while a command submitted to a WLUN is changing the power state of the
> UFS device. The following code from scsi_error.c only wakes up the
> error
> handler if (shost->host_failed || shost->host_eh_scheduled) &&
> shost->host_failed == scsi_host_busy(shost):
>
> if ((shost->host_failed == 0 && shost->host_eh_scheduled == 0)
> || shost->host_failed != scsi_host_busy(shost)) {
> schedule();
> continue;
> }
> /* Handle SCSI errors */
>
It is not completely right - wl_suspend/resume() are much more twisted.
wl_suspend() may or may NOT send a SCSI cmd to WLUN, i.e., SSU cmd may
be skipped if spm/rpm_lvl is 0/1 and/or if bkops/wb is on-going (even
when
rpm_lvl is not 0/1), while link can still be put to hibern8/off, then
power/clks can still be shutdown to save power.
wl_resume(), in case of rpm/spm_lvl == 5, does a full reset to UFS
device,
without sending a SSU cmd to WLU to complete the power state transition.
So above checks (in scsi_error_handler()) cannot gaurantee that actual
power state transistions in UFS driver has ceased before start UFS error
handling.
Thanks,
Can Guo.
> Thanks,
>
> Bart.
Powered by blists - more mailing lists