[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <386c2e650232d7a900f5c1bbf98bd5a5@codeaurora.org>
Date: Wed, 23 Jun 2021 09:34:42 +0800
From: Can Guo <cang@...eaurora.org>
To: Bart Van Assche <bvanassche@....org>
Cc: asutoshd@...eaurora.org, nguyenb@...eaurora.org,
hongwus@...eaurora.org, ziqichen@...eaurora.org,
linux-scsi@...r.kernel.org, kernel-team@...roid.com,
Alim Akhtar <alim.akhtar@...sung.com>,
Avri Altman <avri.altman@....com>,
"James E.J. Bottomley" <jejb@...ux.ibm.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Stanley Chu <stanley.chu@...iatek.com>,
Bean Huo <beanhuo@...ron.com>,
Jaegeuk Kim <jaegeuk@...nel.org>,
open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 8/9] scsi: ufs: Update the fast abort path in
ufshcd_abort() for PM requests
Hi Bart,
On 2021-06-17 01:55, Bart Van Assche wrote:
> On 6/16/21 1:47 AM, Can Guo wrote:
>> On 2021-06-16 12:40, Bart Van Assche wrote:
>>> On 6/15/21 9:00 PM, Can Guo wrote:
>>>> 2. And say we want SCSI layer to resubmit PM requests to prevent
>>>> suspend/resume fail, we should keep retrying the PM requests (so
>>>> long as error handler can recover everything successfully),
>>>> meaning we should give them unlimited retries (which I think is a
>>>> bad idea), otherwise (if they have zero retries or limited
>>>> retries), in extreme conditions, what may happen is that error
>>>> handler can recover everything successfully every time, but all
>>>> these retries (say 3) still time out, which block the power
>>>> management for too long (retries * 60 seconds) and, most
>>>> important, when the last retry times out, scsi layer will
>>>> anyways complete the PM request (even we return DID_IMM_RETRY),
>>>> then we end up same - suspend/resume shall run concurrently with
>>>> error handler and we couldn't recover saved PM errors.
>>>
>>> Hmm ... it is not clear to me why this behavior is considered a
>>> problem?
>>
>> To me, task abort to PM requests does not worth being treated so
>> differently, after all suspend/resume may fail due to any kinds of
>> UFS errors (as I've explained so many times). My idea is to let PM
>> requests fast fail (60 seconds has passed, a broken device maybe, we
>> have reason to fail it since it is just a passthrough req) and
>> schedule UFS error handler, UFS error handler shall proceed after
>> suspend/resume fails out then start to recover everything in a safe
>> environment. Is this way not working?
> Hi Can,
>
> Thank you for the clarification. As you probably know the power
> management subsystem serializes runtime power management (RPM) and
> system suspend callbacks. I was concerned about the consequences of a
> failed RPM transition on system suspend and resume. Having taken a
> closer look at the UFS driver, I see that failed RPM transitions do not
> require special handling in the system suspend or resume callbacks. In
> other words, I'm fine with the approach of failing PM requests fast.
>
Thank you for your time and efforts spent on this series, I will upload
next version to address your previous comments (hope I can convince
Trilok
to pick these up).
Thanks,
Can Guo.
> Bart.
Powered by blists - more mailing lists