Message-ID: <Pine.LNX.4.44L0.1904091014570.1599-100000@iolanthe.rowland.org>
Date: Tue, 9 Apr 2019 10:44:04 -0400 (EDT)
From: Alan Stern <stern@...land.harvard.edu>
To: "Martin K. Petersen" <martin.petersen@...cle.com>
cc: Kento.A.Kobayashi@...y.com,
"James E.J. Bottomley" <jejb@...ux.ibm.com>,
Oliver Neukum <oneukum@...e.com>, <gregkh@...uxfoundation.org>,
USB Storage list <usb-storage@...ts.one-eyed-alien.net>,
<Jacky.Cao@...y.com>,
Kernel development list <linux-kernel@...r.kernel.org>,
SCSI development list <linux-scsi@...r.kernel.org>,
USB list <linux-usb@...r.kernel.org>
Subject: Re: [PATCH] usb: uas: fix usb subsystem hang after power off hub port
On Mon, 8 Apr 2019, Martin K. Petersen wrote:
>
> Alan,
>
> > So it looks as though the SCSI subsystem doesn't like to have a reset
> > handler call scsi_remove_host.
>
> Are you talking about a PCI device removal handler or a SCSI error
> handler?
The context of this discussion is a USB mass-storage device where the
device's port on its upstream hub has been powered off. The
powered-off port causes an executing command to time out. As a result
the SCSI error handler runs and calls the USB reset routine, but the
reset fails because the kernel is unable to communicate with the device
through the powered-off port. This causes the USB reset routine to
unbind the device from its USB driver, which in turn calls
scsi_remove_host -- while the error handler is still running.
> > Commands dispatched by the removal routines are forced to wait for the
> > reset recovery to finish, which won't happen until those commands have
> > been completed.
> >
> > Is this a bug in the SCSI core? If not, we need to know what is the
> > right way to do things when a reset handler detects that the SCSI host
> > has been hot-unplugged.
>
> PCI surprise removal should generally work. But it's somewhat unusual
> for a SCSI host to evaporate in the middle of error handling. After all,
> the main purpose of eh is to leverage the interfaces provided by the
> host to try to reconnect to a target that tripped and fell off the
> bus...
Still, it's not impossible for a SCSI host to evaporate in the middle
of error handling, given an appropriately mistimed hot-unplug event.
How does the SCSI layer expect this to be handled? Should the
low-level driver wait to call scsi_remove_host until after the error
handling is finished?
What about races? In theory, scsi_remove_host could be called just as
the error handler is starting up.
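To make the "wait until error handling is finished" option concrete, here is a minimal userspace sketch of deferring the removal. Everything in it is hypothetical: struct model_host and the model_* functions are illustration-only names, not SCSI core or USB APIs, and model_remove_host merely stands in for the point where a driver would call scsi_remove_host. The model is single-threaded, so it does not address the race described above; real code would need locking around the flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a host; NOT the kernel's struct Scsi_Host. */
struct model_host {
    bool eh_running;      /* error handler is currently active */
    bool remove_pending;  /* removal was requested while EH was active */
    bool removed;         /* host has been torn down */
};

/* Stand-in for the place a driver would call scsi_remove_host(). */
static void model_remove_host(struct model_host *h)
{
    assert(!h->eh_running);  /* the model never tears down during EH */
    h->removed = true;
}

/* Called from the (hypothetical) unbind path. */
static void model_request_remove(struct model_host *h)
{
    if (h->eh_running)
        h->remove_pending = true;   /* defer until EH has finished */
    else if (!h->removed)
        model_remove_host(h);       /* no EH in progress: remove now */
}

/* Marks the start of error handling. */
static void model_eh_begin(struct model_host *h)
{
    h->eh_running = true;
}

/* Marks the end of error handling and runs any deferred removal. */
static void model_eh_end(struct model_host *h)
{
    h->eh_running = false;
    if (h->remove_pending) {
        h->remove_pending = false;
        model_remove_host(h);
    }
}
```

In this sketch, a removal requested between model_eh_begin and model_eh_end is only recorded, and the teardown happens in model_eh_end, so the removal path never waits on the error handler.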
Alan Stern