lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 12 Jun 2021 14:46:20 +0800
From:   Can Guo <cang@...eaurora.org>
To:     Bart Van Assche <bvanassche@....org>
Cc:     Adrian Hunter <adrian.hunter@...el.com>, asutoshd@...eaurora.org,
        nguyenb@...eaurora.org, hongwus@...eaurora.org,
        ziqichen@...eaurora.org, linux-scsi@...r.kernel.org,
        kernel-team@...roid.com, Alim Akhtar <alim.akhtar@...sung.com>,
        Avri Altman <avri.altman@....com>,
        "James E.J. Bottomley" <jejb@...ux.ibm.com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        Stanley Chu <stanley.chu@...iatek.com>,
        Bean Huo <beanhuo@...ron.com>,
        Jaegeuk Kim <jaegeuk@...nel.org>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 5/9] scsi: ufs: Simplify error handling preparation

On 2021-06-12 04:58, Bart Van Assche wrote:
> On 6/10/21 8:01 PM, Can Guo wrote:
>> Previously, without commit cb7e6f05fce67c965194ac04467e1ba7bc70b069,
>> ufshcd_resume() may turn off pwr and clk due to UFS error, e.g., link
>> transition failure and SSU error/abort (and these UFS error would
>> invoke error handling).  When error handling kicks start, it should
>> re-enable the pwr and clk before proceeding. Now, commit
>> cb7e6f05fce67c965194ac04467e1ba7bc70b069 makes ufshcd_resume()
>> purely control pwr and clk, meaning if ufshcd_resume() fails, there
>> is nothing we can do about it - pwr or clk enabling must have failed,
>> and it is not because of UFS error. This is why I am removing the
>> re-enabling pwr/clk in error handling prepare.
> 
> Why are link transition failures handled in the error handler instead 
> of
> in the context where these errors are detected (ufshcd_resume())? Is it
> even possible to recover from a link transition failure or does this
> perhaps indicate a broken UFS controller?

Basically, almost all UFS failures are caused by errors in underlaying 
layers,
i.e., UIC errors, including link transition failures. And according to 
UFSHCI
spec, SW should do a full reset to recover it, just like handle any 
other
fatal UIC errors. All UIC errors are detected by HW and reported by IRQ 
handler.

UFSHCI Spec Ver. 31
8.2.7 Hibernate Enter/Exit Error Handling
Hibernate Enter/Exit Error occurs when the UniPro link is broken. When 
this condition occurs,
host software should reset the host controller by setting register HCE 
to ‘0’, re-initialize the host
controller by setting register HCE to ‘1', and then start link startup 
sequence as shown in Figure 16.

> 
>>> but what I really wonder is why we don't just do recovery directly
>>> in __ufshcd_wl_suspend() and  __ufshcd_wl_resume() and strip all
>>> the PM complexity out of ufshcd_err_handling()?
> 
> +1

I've explained why I chose not to do this in my last reply to Adrian.
Please kindly check it.

> 
>> For system suspend/resume, since error handling has the same nature
>> like user access, so we are using host_sem to avoid concurrency of
>> error handling and system suspend/resume.
> 
> Why is host_sem used for that purpose instead of lock_system_sleep() 
> and
> unlock_system_sleep()?
> 

I was aware of it, but the situation is that host_sem is also used to
avoid concurrency among user access, error handling and shutdown, so
I think just use host_sem anyways to simply the lockings, otherwise
user access and error handling would have to take both 
system_transition_mutex
and host_sem

Thanks,

Can Guo.

> Thanks,
> 
> Bart.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ