linux-kernel - Re: [PATCH] usb: uas: fix usb subsystem hang after power off hub port

Message-ID: <Pine.LNX.4.44L0.1904091014570.1599-100000@iolanthe.rowland.org>
Date:   Tue, 9 Apr 2019 10:44:04 -0400 (EDT)
From:   Alan Stern <stern@...land.harvard.edu>
To:     "Martin K. Petersen" <martin.petersen@...cle.com>
cc:     Kento.A.Kobayashi@...y.com,
        "James E.J. Bottomley" <jejb@...ux.ibm.com>,
        Oliver Neukum <oneukum@...e.com>, <gregkh@...uxfoundation.org>,
        USB Storage list <usb-storage@...ts.one-eyed-alien.net>,
        <Jacky.Cao@...y.com>,
        Kernel development list <linux-kernel@...r.kernel.org>,
        SCSI development list <linux-scsi@...r.kernel.org>,
        USB list <linux-usb@...r.kernel.org>
Subject: Re: [PATCH] usb: uas: fix usb subsystem hang after power off hub
 port

On Mon, 8 Apr 2019, Martin K. Petersen wrote:

> 
> Alan,
> 
> > So it looks as though the SCSI subsystem doesn't like to have a reset 
> > handler call scsi_remove_host.
> 
> Are you talking about a PCI device removal handler or a SCSI error
> handler?

The context of this discussion is a USB mass-storage device where the
device's port on its upstream hub has been powered off.  The
powered-off port causes an executing command to time out.  As a result
the SCSI error handler runs and calls the USB reset routine, but the
reset fails because the kernel is unable to communicate with the device
through the powered-off port.  This causes the USB reset routine to
unbind the device from its USB driver, which in turn calls
scsi_remove_host -- while the error handler is still running.

> > Commands dispatched by the removal routines are forced to wait for the
> > reset recovery to finish, which won't happen until those commands have
> > been completed.
> >
> > Is this a bug in the SCSI core?  If not, we need to know what is the
> > right way to do things when a reset handler detects that the SCSI host
> > has been hot-unplugged.
> 
> PCI surprise removal should generally work. But it's somewhat unusual
> for a SCSI host to evaporate in the middle of error handling. After all,
> the main purpose of eh is to leverage the interfaces provided by the
> host to try to reconnect to a target that tripped and fell off the
> bus...

Still, it's not impossible for a SCSI host to evaporate in the middle
of error handling, given an appropriately mistimed hot-unplug event.  
How does the SCSI layer expect this to be handled?  Should the
low-level driver wait to call scsi_remove_host until after the error
handling is finished?

What about races?  In theory, scsi_remove_host could be called just as 
the error handler is starting up.

Alan Stern
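
The option raised above, having the low-level driver hold off on calling scsi_remove_host until error handling has finished, can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not how the SCSI core or the uas driver behaves: every name here (model_host, model_request_remove, model_eh_begin, model_eh_end) is hypothetical and none is a kernel API. The model is single-threaded; to address the race where removal is requested just as the error handler starts, a real driver would have to check and update these flags under a single lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_host {
    bool eh_running;      /* error handler currently active */
    bool remove_pending;  /* removal requested while EH was active */
    bool removed;         /* teardown has actually happened */
};

/* Stand-in for the real teardown; must never run during error handling. */
static void model_do_remove(struct model_host *h)
{
    assert(!h->eh_running);
    h->removed = true;
}

/* Called on unbind or hot-unplug: tear down now, or defer if EH is active. */
static void model_request_remove(struct model_host *h)
{
    if (h->removed)
        return;
    if (h->eh_running)
        h->remove_pending = true;
    else
        model_do_remove(h);
}

/* Entering error handling: removals requested from now on are deferred. */
static void model_eh_begin(struct model_host *h)
{
    h->eh_running = true;
}

/* Leaving error handling runs any removal that was deferred. */
static void model_eh_end(struct model_host *h)
{
    h->eh_running = false;
    if (h->remove_pending) {
        h->remove_pending = false;
        model_do_remove(h);
    }
}
```

In this model, a failed reset that triggers an unbind inside the error handler only records the request via model_request_remove; the teardown itself runs from model_eh_end, so it never has to wait on a recovery that is itself waiting on the teardown.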