linux-kernel - Re: [PATCH v4 06/10] scsi: ufs: Remove host

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ddf72aae-6a0a-06e3-daf8-84b922d7eb52@acm.org>
Date:   Tue, 29 Jun 2021 11:01:52 -0700
From:   Bart Van Assche <bvanassche@....org>
To:     Can Guo <cang@...eaurora.org>
Cc:     Adrian Hunter <adrian.hunter@...el.com>, asutoshd@...eaurora.org,
        nguyenb@...eaurora.org, hongwus@...eaurora.org,
        ziqichen@...eaurora.org, linux-scsi@...r.kernel.org,
        kernel-team@...roid.com, Alim Akhtar <alim.akhtar@...sung.com>,
        Avri Altman <avri.altman@....com>,
        "James E.J. Bottomley" <jejb@...ux.ibm.com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        Stanley Chu <stanley.chu@...iatek.com>,
        Bean Huo <beanhuo@...ron.com>,
        Jaegeuk Kim <jaegeuk@...nel.org>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 06/10] scsi: ufs: Remove host_sem used in
 suspend/resume

On 6/28/21 11:23 PM, Can Guo wrote:
> On 2021-06-29 01:31, Bart Van Assche wrote:
>> On 6/28/21 1:17 AM, Can Guo wrote:
>>> On 2021-06-25 01:11, Bart Van Assche wrote:
>>>> On 6/23/21 11:31 PM, Can Guo wrote:
>>>>> Using back host_sem in suspend_prepare()/resume_complete()
>>>>> won't have this problem of deadlock, right?
>>>> 
>>>> Although that would solve the deadlock discussed in this email 
>>>> thread, it wouldn't solve the issue of potential adverse
>>>> interactions of the UFS error handler and the SCSI error
>>>> handler running concurrently.
>>> 
>>> I think I've explained it before, paste it here -
>>> 
>>> ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and 
>>> flushes it, so SCSI error handler and UFS error handler can
>>> safely run together.
>> 
>> That code path is the exception. Do you agree that the following
>> three functions all invoke the ufshcd_err_handler() function
>> asynchronously? * ufshcd_uic_pwr_ctrl() * ufshcd_check_errors() *
>> ufshcd_abort()
> 
> I agree, but I don't see what's wrong with that. Any context can
> invoke ufs error handler asynchronously and ufs error handler prepare
> makes sure error handler can work safely, i.e., stopping PM
> ops/gating/scaling in error handler prepare makes sure no one shall
> call ufshcd_uic_pwr_ctrl() ever again. And ufshcd_check_errors() and
> ufshcd_abort() are OK to run concurrently with UFS error handler.

The current UFS error handling approach requires the following code in
ufshcd_queuecommand():

		if (hba->pm_op_in_progress) {
			hba->force_reset = true;
			set_host_byte(cmd, DID_BAD_TARGET);
			cmd->scsi_done(cmd);
			goto out;
		}

Removing that code is not possible with the current error handling
approach. My patch makes it possible to remove that code.

> Sorry that I missed the change of scsi_transport_template() in your 
> previous message. I can understand that you want to invoke UFS error
> hander by invoking SCSI error handler, but I didn't go that far
> because I saw you changed pm_runtime_get_sync() to
> pm_runtime_get_noresume() in ufs error handler prepare. How can that
> change make sure that the device is not suspending or resuming while
> error handler is running?

UFS power state transitions happen by submitting a SCSI command to a
WLUN. The SCSI error handler is only activated after all outstanding
SCSI commands for a SCSI host have failed or completed. I think this
guarantees for the UFS driver that eh_strategy_handler is not invoked
while a command submitted to a WLUN is changing the power state of the
UFS device. The following code from scsi_error.c only wakes up the error
handler if (shost->host_failed || shost->host_eh_scheduled) &&
shost->host_failed == scsi_host_busy(shost):

	if ((shost->host_failed == 0 && shost->host_eh_scheduled == 0)
	    || shost->host_failed != scsi_host_busy(shost)) {
		schedule();
		continue;
	}
	/* Handle SCSI errors */

Thanks,

Bart.