Date:   Wed, 6 Jul 2022 14:17:38 +0200
From:   Oliver Neukum <oneukum@...e.com>
To:     Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        Greg KH <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     Len Brown <len.brown@...el.com>, Pavel Machek <pavel@....cz>,
        Arnd Bergmann <arnd@...db.de>, linux-kernel@...r.kernel.org,
        linux-pm@...r.kernel.org,
        Wedson Almeida Filho <wedsonaf@...gle.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Arjan van de Ven <arjan@...ux.intel.com>
Subject: Re: [PATCH] char: misc: make misc_open() and misc_register() killable



On 06.07.22 12:26, Tetsuo Handa wrote:

> wait_for_device_probe() in snapshot_open() was added by commit c751085943362143
> ("PM/Hibernate: Wait for SCSI devices scan to complete during resume"), and
> that commit did not take into account possibility of unresponsive hardware.
> 
>    "In addition, if the resume from hibernation is userland-driven, it's
>     better to wait for all device probes in the kernel to complete before
>     attempting to open the resume device."
> 
> 

Tetsuo-san,

I am afraid my first reply was too curt to be useful. Sorry for that.
First, let me congratulate you on finding and analyzing an important
issue.
Yet I am afraid that, while your analysis is good, your attempt at a fix
suffers from being too close to the analysis, instead of taking a step
back and looking at root causes.
Frankly, I was afraid you'd look at UAS next and try to fix it in the
same way. And that is the core of the issue. If the SCSI layer can be
made to hang a host controller by an unresponsive device, the issue
is in the SCSI layer. If you were to insist on your current approach,
you'd have to go through every host controller driver. You are seeing
this only with storage because you are fuzzing USB, not SCSI.
But the bug you found is more fundamental than a single bus system.

The SCSI layer is just designed in such a way that timeouts are handled
by the core. That is a fundamental design decision you cannot easily
deviate from. Hence I would like to ask you to take a closer look
at the scanning code in the SCSI layer, not a host controller driver.

	Regards
		Oliver
