Message-ID: <0505654c-e487-6b91-57cf-fa7996f5c738@suse.de>
Date:   Mon, 12 Jun 2023 08:09:50 +0200
From:   Hannes Reinecke <hare@...e.de>
To:     Damien Le Moal <dlemoal@...nel.org>,
        Bart Van Assche <bvanassche@....org>,
        Bagas Sanjaya <bagasdotme@...il.com>,
        Pavel Machek <pavel@....cz>
Cc:     "Rafael J. Wysocki" <rafael@...nel.org>,
        Len Brown <len.brown@...el.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Kees Cook <keescook@...omium.org>,
        Tony Luck <tony.luck@...el.com>,
        "Guilherme G. Piccoli" <gpiccoli@...lia.com>,
        Thorsten Leemhuis <linux@...mhuis.info>,
        "James E.J. Bottomley" <jejb@...ux.ibm.com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        Phillip Potter <phil@...lpotter.co.uk>,
        Joe Breuer <linux-kernel@...reuer.net>,
        Linux Power Management <linux-pm@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux Hardening <linux-hardening@...r.kernel.org>,
        Linux Regressions <regressions@...ts.linux.dev>,
        Linux SCSI <linux-scsi@...r.kernel.org>,
        Alan Stern <stern@...land.harvard.edu>,
        Dan Williams <dan.j.williams@...el.com>,
        Hannes Reinecke <hare@...e.com>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Martin Kepplinger <martin.kepplinger@...i.sm>,
        Kai-Heng Feng <kai.heng.feng@...onical.com>
Subject: Re: Fwd: Waking up from resume locks up on sr device

On 6/12/23 05:09, Damien Le Moal wrote:
> On 6/11/23 00:03, Bart Van Assche wrote:
>> On 6/10/23 06:27, Bagas Sanjaya wrote:
>>> On 6/10/23 15:55, Pavel Machek wrote:
>>>>>> #regzbot introduced: v5.0..v6.4-rc5 https://bugzilla.kernel.org/show_bug.cgi?id=217530
>>>>>> #regzbot title: Waking up from resume locks up on SCSI CD/DVD drive
>>>>>>
>>>>> The reporter had found the culprit (via bisection), so:
>>>>>
>>>>> #regzbot introduced: a19a93e4c6a98c
>>>> Maybe cc the authors of that commit?
>>>
>>> Ah! I forgot to do that! Thanks anyway.
>>
>> Hi Damien,
>>
>> Why does the ATA code call scsi_rescan_device() before system resume has
>> finished? Would ATA devices still work with the patch below applied?
> 
> I do not know the PM code well at all, need to dig into it. But your patch
> worries me as it seems it would prevent rescan of the device on a resume, which
> can be an issue if the device has changed.
> 
> I am not yet 100% clear on the root cause for this, but I think it comes from
> the fact that ata_port_pm_resume() runs before the scsi device resume is done, so
> with scsi_dev->power.is_suspended still true. And ata_port_pm_resume() calls
> ata_port_resume_async() which triggers EH (which will do reset + rescan)
> asynchronously. So it looks like we have scsi device resume and libata EH for
> rescan fighting each other for the scan mutex and device lock, leading to deadlock.
> 
> Trying to recreate this issue now to confirm and debug further. But I suspect
> the solution to this may be best implemented in libata, not in scsi.
> This looks definitely related to this thread:
> 
> https://lore.kernel.org/linux-scsi/7b553268-69d3-913a-f9de-28f8d45bdb1e@acm.org/
> 
> Similarly to your comment on that thread, having to look at
> dev->power.is_suspended is not ideal I think. What we need is to have ata and
> scsi pm resume be synchronized, but I am not yet 100% clear on the scsi layer side.
> 
Which is my feeling, too.
libata runs rescan as part of the device discovery, so really it will 
run after resume. And consequently resume really cannot wait for rescan 
to finish.

What I would be looking at is to decouple resume from libata device 
rescan, and have resume complete before libata EH runs.
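
To illustrate the ordering I have in mind, here is a rough user-space model in
Python. It is not kernel code and not a proposed patch: the Device class, the
resume_done event and the single lock (standing in for the scan mutex and
device lock) are all made up for the sketch. The point is only that resume
never waits for rescan, while the rescan side waits for resume to be done.

```python
import threading


class Device:
    """Hypothetical stand-in for a SCSI device behind a libata port."""

    def __init__(self):
        # Assumed flag: set once the device resume has finished.
        self.resume_done = threading.Event()
        # One lock standing in for the scan mutex / device lock.
        self.lock = threading.Lock()
        self.log = []


def resume(dev):
    # Resume only restores the device state; it never waits for rescan.
    with dev.lock:
        dev.log.append("resume")
    dev.resume_done.set()


def deferred_rescan(dev):
    # Stand-in for libata EH: block until resume has completed,
    # and only then take the lock and rescan.
    dev.resume_done.wait()
    with dev.lock:
        dev.log.append("rescan")


def run():
    dev = Device()
    eh = threading.Thread(target=deferred_rescan, args=(dev,))
    eh.start()  # EH gets scheduled before resume has run
    resume(dev)
    eh.join(timeout=5)
    return dev.log
```

Even with the EH thread started first, the rescan is only logged after the
resume, because the rescan side is the one doing the waiting.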

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@...e.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman
