linux-kernel - Re: [PATCH v3 8/9] scsi: ufs: Update the fast abort path in ufshcd

Message-ID: <386c2e650232d7a900f5c1bbf98bd5a5@codeaurora.org>
Date:   Wed, 23 Jun 2021 09:34:42 +0800
From:   Can Guo <cang@...eaurora.org>
To:     Bart Van Assche <bvanassche@....org>
Cc:     asutoshd@...eaurora.org, nguyenb@...eaurora.org,
        hongwus@...eaurora.org, ziqichen@...eaurora.org,
        linux-scsi@...r.kernel.org, kernel-team@...roid.com,
        Alim Akhtar <alim.akhtar@...sung.com>,
        Avri Altman <avri.altman@....com>,
        "James E.J. Bottomley" <jejb@...ux.ibm.com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        Stanley Chu <stanley.chu@...iatek.com>,
        Bean Huo <beanhuo@...ron.com>,
        Jaegeuk Kim <jaegeuk@...nel.org>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 8/9] scsi: ufs: Update the fast abort path in
 ufshcd_abort() for PM requests

Hi Bart,

On 2021-06-17 01:55, Bart Van Assche wrote:
> On 6/16/21 1:47 AM, Can Guo wrote:
>> On 2021-06-16 12:40, Bart Van Assche wrote:
>>> On 6/15/21 9:00 PM, Can Guo wrote:
>>>> 2. And say we want the SCSI layer to resubmit PM requests to
>>>> prevent suspend/resume failures: we would have to keep retrying
>>>> the PM requests (as long as the error handler can recover
>>>> everything successfully), meaning we should give them unlimited
>>>> retries (which I think is a bad idea). Otherwise (if they have
>>>> zero or limited retries), in extreme conditions the error handler
>>>> may recover everything successfully every time, yet all of these
>>>> retries (say 3) still time out. That blocks power management for
>>>> too long (retries * 60 seconds) and, most importantly, when the
>>>> last retry times out, the SCSI layer will complete the PM request
>>>> anyway (even if we return DID_IMM_RETRY). We then end up in the
>>>> same place: suspend/resume runs concurrently with the error
>>>> handler and we cannot recover the saved PM errors.
>>> 
>>> Hmm ... it is not clear to me why this behavior is considered a
>>> problem?
>> 
>> To me, a task abort of a PM request is not worth treating so
>> differently; after all, suspend/resume may fail due to any kind of
>> UFS error (as I've explained so many times). My idea is to let PM
>> requests fail fast (60 seconds have passed, the device may be
>> broken, and we have reason to fail the request since it is just a
>> passthrough req) and schedule the UFS error handler. The UFS error
>> handler shall proceed after suspend/resume fails out and then start
>> to recover everything in a safe environment. Does this approach not
>> work?
> Hi Can,
> 
> Thank you for the clarification. As you probably know the power
> management subsystem serializes runtime power management (RPM) and
> system suspend callbacks. I was concerned about the consequences of a
> failed RPM transition on system suspend and resume. Having taken a
> closer look at the UFS driver, I see that failed RPM transitions do not
> require special handling in the system suspend or resume callbacks. In
> other words, I'm fine with the approach of failing PM requests fast.
> 

Thank you for the time and effort you spent on this series. I will
upload the next version to address your previous comments (hope I can
convince Trilok to pick these up).

Thanks,

Can Guo.

> Bart.
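
For illustration only, the "retries * 60 seconds" concern raised in the
quoted discussion can be put into numbers with a small sketch. This is
not kernel code and not part of the patch. The function and constant
names are hypothetical, and it assumes that every attempt runs into the
full 60-second timeout mentioned in the thread and that error handler
recovery time is ignored.

```python
# Back-of-the-envelope model of how long a timed-out PM request can
# block power management. Hypothetical names; not taken from ufshcd.

PM_REQUEST_TIMEOUT_S = 60  # per-attempt timeout quoted in the thread


def worst_case_block_seconds(attempts: int,
                             timeout_s: int = PM_REQUEST_TIMEOUT_S) -> int:
    """Upper bound on blocking time if every attempt times out.

    `attempts` counts each submission of the PM request that is allowed
    to run into its timeout. Recovery time between attempts is ignored
    (an assumption of this sketch).
    """
    if attempts < 1:
        raise ValueError("at least one attempt is always made")
    return attempts * timeout_s


if __name__ == "__main__":
    # Fast-fail approach: the request is failed after one timeout.
    print("fast fail:", worst_case_block_seconds(1), "seconds")
    # The "say 3" case from the thread: three timeouts in a row.
    print("3 attempts:", worst_case_block_seconds(3), "seconds")
```

Under these assumptions, failing fast bounds the stall at a single
timeout, while each further attempt that times out adds another full
timeout before suspend/resume can fail out.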