linux-kernel - Re: [PATCH v4 06/10] scsi: ufs: Remove host

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <a87e5ca5-390f-8ca0-41bf-27cdc70e3316@intel.com>
Date:   Thu, 24 Jun 2021 08:52:24 +0300
From:   Adrian Hunter <adrian.hunter@...el.com>
To:     Can Guo <cang@...eaurora.org>
Cc:     asutoshd@...eaurora.org, nguyenb@...eaurora.org,
        hongwus@...eaurora.org, ziqichen@...eaurora.org,
        linux-scsi@...r.kernel.org, kernel-team@...roid.com,
        Alim Akhtar <alim.akhtar@...sung.com>,
        Avri Altman <avri.altman@....com>,
        "James E.J. Bottomley" <jejb@...ux.ibm.com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        Stanley Chu <stanley.chu@...iatek.com>,
        Bean Huo <beanhuo@...ron.com>,
        Jaegeuk Kim <jaegeuk@...nel.org>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 06/10] scsi: ufs: Remove host_sem used in
 suspend/resume

On 24/06/21 5:16 am, Can Guo wrote:
> On 2021-06-23 22:30, Adrian Hunter wrote:
>> On 23/06/21 10:35 am, Can Guo wrote:
>>> To protect system suspend/resume from being disturbed by error handling,
>>> instead of using host_sem, let error handler call lock_system_sleep() and
>>> unlock_system_sleep() which achieve the same purpose. Remove the host_sem
>>> used in suspend/resume paths to make the code more readable.
>>>
>>> Suggested-by: Bart Van Assche <bvanassche@....org>
>>> Signed-off-by: Can Guo <cang@...eaurora.org>
>>> ---
>>>  drivers/scsi/ufs/ufshcd.c | 12 +++++++-----
>>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>>> index 3695dd2..a09e4a2 100644
>>> --- a/drivers/scsi/ufs/ufshcd.c
>>> +++ b/drivers/scsi/ufs/ufshcd.c
>>> @@ -5907,6 +5907,11 @@ static void ufshcd_clk_scaling_suspend(struct ufs_hba *hba, bool suspend)
>>>
>>>  static void ufshcd_err_handling_prepare(struct ufs_hba *hba)
>>>  {
>>> +    /*
>>> +     * It is not safe to perform error handling while suspend or resume is
>>> +     * in progress. Hence the lock_system_sleep() call.
>>> +     */
>>> +    lock_system_sleep();
>>
>> It looks to me like the system takes this lock quite early, even before
>> freezing tasks, so if anything needs the error handler to run it will
>> deadlock.
> 
> Hi Adrian,
> 
> UFS/hba system suspend/resume does not invoke or call error handling in a
> synchronous way. So, whatever UFS errors (which schedules the error handler)
> happens during suspend/resume, error handler will just wait here till system
> suspend/resume release the lock. Hence no worries of deadlock here.

It looks to me like the state can change to UFSHCD_STATE_EH_SCHEDULED_FATAL
and since user processes are not frozen, nor file systems sync'ed, everything
is going to deadlock.
i.e.
I/O is blocked waiting on error handling
error handling is blocked waiting on lock_system_sleep()
suspend is blocked waiting on I/O

> 
> Thanks,
> 
> Can Guo.
> 
>>
>>>      ufshcd_rpm_get_sync(hba);
>>>      if (pm_runtime_status_suspended(&hba->sdev_ufs_device->sdev_gendev) ||
>>>          hba->is_wlu_sys_suspended) {
>>> @@ -5951,6 +5956,7 @@ static void ufshcd_err_handling_unprepare(struct ufs_hba *hba)
>>>          ufshcd_clk_scaling_suspend(hba, false);
>>>      ufshcd_clear_ua_wluns(hba);
>>>      ufshcd_rpm_put(hba);
>>> +    unlock_system_sleep();
>>>  }
>>>
>>>  static inline bool ufshcd_err_handling_should_stop(struct ufs_hba *hba)
>>> @@ -9053,16 +9059,13 @@ static int ufshcd_wl_suspend(struct device *dev)
>>>      ktime_t start = ktime_get();
>>>
>>>      hba = shost_priv(sdev->host);
>>> -    down(&hba->host_sem);
>>>
>>>      if (pm_runtime_suspended(dev))
>>>          goto out;
>>>
>>>      ret = __ufshcd_wl_suspend(hba, UFS_SYSTEM_PM);
>>> -    if (ret) {
>>> +    if (ret)
>>>          dev_err(&sdev->sdev_gendev, "%s failed: %d\n", __func__,  ret);
>>> -        up(&hba->host_sem);
>>> -    }
>>>
>>>  out:
>>>      if (!ret)
>>> @@ -9095,7 +9098,6 @@ static int ufshcd_wl_resume(struct device *dev)
>>>          hba->curr_dev_pwr_mode, hba->uic_link_state);
>>>      if (!ret)
>>>          hba->is_wlu_sys_suspended = false;
>>> -    up(&hba->host_sem);
>>>      return ret;
>>>  }
>>>  #endif
>>>