[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aGFBW7wet9V4WENC@wunner.de>
Date: Sun, 29 Jun 2025 15:36:27 +0200
From: Lukas Wunner <lukas@...ner.de>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: linux-kernel@...r.kernel.org, Bjorn Helgaas <bhelgaas@...gle.com>,
linux-pci@...r.kernel.org, Parav Pandit <parav@...dia.com>,
virtualization@...ts.linux.dev, stefanha@...hat.com,
alok.a.tiwari@...cle.com
Subject: Re: [PATCH RFC] pci: report surprise removal events
On Sat, Jun 28, 2025 at 02:58:49PM -0400, Michael S. Tsirkin wrote:
> At the moment, in case of a surprise removal, the regular
> remove callback is invoked, exclusively.
> This works well, because mostly, the cleanup would be the same.
>
> However, there's a race: imagine device removal was initiated by a user
> action, such as driver unbind, and it in turn initiated some cleanup and
> is now waiting for an interrupt from the device. If the device is now
> surprise-removed, that never arrives and the remove callback hangs
> forever.
>
> Drivers can artificially add timeouts to handle that, but it can be
> flaky.
>
> Instead, let's add a way for the driver to be notified about the
> disconnect. It can then do any necessary cleanup, knowing that the
> device is inactive.
[...]
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -549,6 +549,15 @@ static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
> pci_dev_set_io_state(dev, pci_channel_io_perm_failure);
> pci_doe_disconnected(dev);
>
> + /* Notify driver of surprise removal */
> + device_lock(&dev->dev);
> +
> + if (dev->driver && dev->driver->err_handler &&
> + dev->driver->err_handler->disconnect)
> + dev->driver->err_handler->disconnect(dev);
> +
> + device_unlock(&dev->dev);
> +
> return 0;
> }
No, that's not good:
1/ The device_lock() will reintroduce the issues solved by 74ff8864cc84.
2/ pci_dev_set_disconnected() needs to be fast so that devices are marked
unplugged as quickly as possible. We want to minimize the time window
where MMIO and Config Space reads already return "all ones" and writes
go to nirvana, but pci_dev_is_disconnected() still returns false.
Hence invoking some driver callback which may take arbitrarily long or
even sleeps is not an option.
The driver is already notified of removal through invocation of the
->remove() callback. The use case you're describing is arguably
a corner case. I do think that a timeout is a better approach
than the one proposed here. How long does it take for the interrupt
to arrive? If it's not just a few msec, consider polling the device
and breaking out of the pool loop as soon as pci_dev_is_disconnected()
returns true (or the MMIO read returns PCI_POSSIBLE_ERROR()).
If/when respinning, please explain the use case in more detail,
i.e. which driver, which device, pointers to code...
Thanks!
Lukas
Powered by blists - more mailing lists