Message-ID: <20250714021713-mutt-send-email-mst@kernel.org>
Date: Mon, 14 Jul 2025 02:17:31 -0400
From: "Michael S. Tsirkin" <mst@...hat.com>
To: Keith Busch <kbusch@...nel.org>
Cc: Bjorn Helgaas <helgaas@...nel.org>, linux-kernel@...r.kernel.org,
Lukas Wunner <lukas@...ner.de>, Bjorn Helgaas <bhelgaas@...gle.com>,
Parav Pandit <parav@...dia.com>, virtualization@...ts.linux.dev,
stefanha@...hat.com, alok.a.tiwari@...cle.com,
linux-pci@...r.kernel.org
Subject: Re: [PATCH RFC v5 1/5] pci: report surprise removal event

On Wed, Jul 09, 2025 at 05:55:17PM -0600, Keith Busch wrote:
> On Wed, Jul 09, 2025 at 06:38:20PM -0500, Bjorn Helgaas wrote:
> > This relies on somebody (typically pciehp, I guess) calling
> > pci_dev_set_disconnected() when a surprise remove happens.
> >
> > Do you think it would be practical for the driver's .remove() method
> > to recognize that the device may stop responding at any point, even if
> > no hotplug driver is present to call pci_dev_set_disconnected()?
> >
> > Waiting forever for an interrupt seems kind of vulnerable in general.
> > Maybe "artificially adding timeouts" is alluding to *not* waiting
> > forever for interrupts? That doesn't seem artificial to me because
> > it's just a fact of life that devices can disappear at arbitrary
> > times.
>
> I totally agree here. Every driver's .remove() should be able to
> guarantee forward progress some way. I put some work in blk-mq and nvme
> to ensure that happens for those devices at least.
>
> That "forward progress" can come slow though, maybe minutes, so we do
> have opportunistic shortcuts sprinkled about the driver. There are
> still gaps when waiting for interrupt driven IO that need the longer
> timeouts to trigger. It'd be cool if there was a mechanism to kick in
> quicker, but this is still an uncommon exceptional condition, right?

It's uncommon, yes.
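
For illustration only, the pattern discussed above (a bounded wait as the
slow fallback, plus a disconnect check as the quick shortcut) might look
roughly like the userspace sketch below. This is not kernel code and not
taken from any patch in this series: struct fake_dev, wait_bounded() and
the tick counter are hypothetical names made up for the example, and the
"disconnected" field only stands in for whatever state a hotplug driver
would set.

```c
#include <stdbool.h>

/* Hypothetical device state for illustration; not a kernel structure. */
struct fake_dev {
	bool disconnected;	/* stands in for a "surprise removed" flag */
	int completes_at;	/* tick at which completion arrives, -1 = never */
};

enum wait_result { WAIT_DONE, WAIT_GONE, WAIT_TIMED_OUT };

/*
 * Wait in short slices instead of sleeping forever.  Each slice checks
 * the disconnect flag first (the quick shortcut); the loop as a whole is
 * bounded by max_ticks (the slow fallback that guarantees progress).
 */
static enum wait_result wait_bounded(const struct fake_dev *dev, int max_ticks)
{
	for (int tick = 0; tick < max_ticks; tick++) {
		if (dev->disconnected)
			return WAIT_GONE;
		if (dev->completes_at >= 0 && tick >= dev->completes_at)
			return WAIT_DONE;
		/* A real driver would sleep for one slice here. */
	}
	return WAIT_TIMED_OUT;
}
```

With no hotplug driver to set the flag, the caller still gets
WAIT_TIMED_OUT once max_ticks slices have passed, so .remove() makes
forward progress either way; the flag only shortens the wait.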
--
MST