Message-ID: <fabc70f8-6bb8-4b62-3311-f6e0ce9eb2c3@acm.org>
Date:   Tue, 15 Jun 2021 11:25:35 -0700
From:   Bart Van Assche <bvanassche@....org>
To:     Can Guo <cang@...eaurora.org>
Cc:     asutoshd@...eaurora.org, nguyenb@...eaurora.org,
        hongwus@...eaurora.org, ziqichen@...eaurora.org,
        linux-scsi@...r.kernel.org, kernel-team@...roid.com,
        Alim Akhtar <alim.akhtar@...sung.com>,
        Avri Altman <avri.altman@....com>,
        "James E.J. Bottomley" <jejb@...ux.ibm.com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        Stanley Chu <stanley.chu@...iatek.com>,
        Bean Huo <beanhuo@...ron.com>,
        Jaegeuk Kim <jaegeuk@...nel.org>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 8/9] scsi: ufs: Update the fast abort path in
 ufshcd_abort() for PM requests

On 6/14/21 7:36 PM, Can Guo wrote:
> I've considered the similar way (leverage hba->host->eh_noresume) last
> year, but I didn't take this way due to below reasons:
>
> 1. UFS error handler basically does one thing - reset and restore, which
> stops hba [1], resets device [2] and re-probes the device [3]. Stopping
> hba [1] shall complete any pending requests in the doorbell (with error
> or no error). After [1], suspend/resume contexts, blocked by SSU cmd,
> shall be unblocked right away to do whatever it needs to handle the SSU
> cmd failure (completed in [1], so scsi_execute() returns an error), e.g.,
> put link back to the old state. call ufshcd_vops_suspend(), turn off
> irq/clocks/powers and etc... However, reset and restore ([2] and [3]) is
> still running, and it can (most likely) be disturbed by suspend/resume.
> So passing a parameter or using hba->host->eh_noresume to skip
> lock_system_sleep() and unlock_system_sleep() can break the cycle, but
> error handling may run concurrently with suspend/resume. Of course we can
> modify suspend/resume to avoid it, but I was pursuing a minimal change
> to get this fixed.
>
> 2. Whatever way we take to break the cycle, suspend/resume shall fail and
> RPM framework shall save the error to dev.power.runtime_error, leaving
> the device in runtime suspended or active mode permanently. If it is left
> runtime suspended, UFS driver won't accept cmd anymore, while if it is
> left runtime active, powers of UFS device and host will be left ON,
> leading to power penalty. So my main idea is to let suspend/resume
> contexts, blocked by PM cmds, fail fast first and then error handler
> recover everything back to work.

Hi Can,

Has making the UFS error handler fail pending commands with an error
code that causes the SCSI core to resubmit them, e.g. DID_IMM_RETRY or
DID_TRANSPORT_DISRUPTED, been considered? I would like to prevent power
management and suspend/resume callbacks from failing when the error
handler succeeds in recovering the UFS transport.
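
To illustrate the suggestion, here is a minimal user-space sketch and not
driver code. It only shows how a host byte such as DID_IMM_RETRY is packed
into the 32-bit SCSI result and how a core could turn it into a resubmit
instead of an error for the caller. The DID_* values match the Linux SCSI
headers; every sketch_* name is hypothetical, and sketch_decide() is a
heavily simplified stand-in for the SCSI core's real disposition logic,
covering only the three codes shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host byte values as defined by the Linux SCSI headers. */
#define DID_OK                  0x00
#define DID_IMM_RETRY           0x0c
#define DID_TRANSPORT_DISRUPTED 0x0e

/* The host byte occupies bits 16..23 of the 32-bit SCSI result. */
static inline uint32_t sketch_make_result(uint8_t host_byte)
{
	return (uint32_t)host_byte << 16;
}

static inline uint8_t sketch_host_byte(uint32_t result)
{
	return (result >> 16) & 0xff;
}

/* Hypothetical outcome of completing a command (assumption of this sketch). */
enum sketch_disposition { SKETCH_DONE, SKETCH_RETRY, SKETCH_FAIL };

/*
 * Simplified model: DID_IMM_RETRY is always resubmitted, while
 * DID_TRANSPORT_DISRUPTED is resubmitted only while retries remain.
 * Any other host byte is treated as a failure here, which is an
 * assumption of the sketch and not a statement about the kernel.
 */
static enum sketch_disposition sketch_decide(uint32_t result, bool retries_left)
{
	switch (sketch_host_byte(result)) {
	case DID_OK:
		return SKETCH_DONE;
	case DID_IMM_RETRY:
		return SKETCH_RETRY;
	case DID_TRANSPORT_DISRUPTED:
		return retries_left ? SKETCH_RETRY : SKETCH_FAIL;
	default:
		return SKETCH_FAIL;
	}
}
```

In this model, an SSU command completed by the error handler with
sketch_make_result(DID_IMM_RETRY) would be resubmitted, so the blocked
suspend/resume context would only see a failure if the retried command
fails as well.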

Thanks,

Bart.
