Message-ID: <11cfcb55b5302999b0e58b94018f92a379196698.1751136072.git.mst@redhat.com>
Date: Sat, 28 Jun 2025 14:58:49 -0400
From: "Michael S. Tsirkin" <mst@...hat.com>
To: linux-kernel@...r.kernel.org
Cc: Bjorn Helgaas <bhelgaas@...gle.com>, linux-pci@...r.kernel.org,
	Parav Pandit <parav@...dia.com>, virtualization@...ts.linux.dev,
	stefanha@...hat.com, alok.a.tiwari@...cle.com
Subject: [PATCH RFC] pci: report surprise removal events

Currently, on surprise removal, only the regular remove callback is
invoked. This mostly works well, because the cleanup is usually the
same in both cases.

However, there's a race: imagine device removal was initiated by a user
action, such as driver unbind, which in turn started some cleanup and
is now waiting for an interrupt from the device. If the device is
surprise-removed at this point, that interrupt never arrives and the
remove callback hangs forever.

Drivers can add artificial timeouts to handle that, but timeouts can be
flaky.

Instead, let's add a way for the driver to be notified about the
disconnect. It can then do any necessary cleanup, knowing that the
device is inactive.

Since this notification is by design asynchronous with respect to the
normal probe/remove callbacks, I added it as a new callback in
struct pci_error_handlers.

Signed-off-by: Michael S. Tsirkin <mst@...hat.com>
---

Warning: build-tested only at this point.

Posting for early flames/feedback.

Cc'ing people who discussed this problem, specifically in the
virtio blk driver.
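
To illustrate the intended use, here is a minimal userspace sketch, not
kernel code. It mirrors the dispatch this patch adds to
pci_dev_set_disconnected() and shows how a driver might use the callback
to release a waiter in its remove path. The struct definitions are
simplified stand-ins for the kernel ones, and struct foo_dev,
foo_disconnect(), foo_wait_done() and mock_set_disconnected() are
hypothetical names made up for this example. Locking is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pci_dev;

/* Simplified stand-in: only the callback proposed by this patch. */
struct pci_error_handlers {
	/* PCI slot has been disconnected */
	void (*disconnect)(struct pci_dev *dev);
};

/* Simplified stand-ins for the kernel structures. */
struct pci_driver {
	const struct pci_error_handlers *err_handler;
};

struct pci_dev {
	struct pci_driver *driver;
	void *drvdata;
};

/* Hypothetical per-device driver state. */
struct foo_dev {
	bool irq_seen;	/* device acknowledged the cleanup request */
	bool gone;	/* device was surprise-removed */
};

/* Hypothetical driver callback: record that the device is gone. */
static void foo_disconnect(struct pci_dev *pdev)
{
	struct foo_dev *foo = pdev->drvdata;

	assert(foo);	/* mock-only sanity check */
	foo->gone = true;
}

static const struct pci_error_handlers foo_err_handler = {
	.disconnect = foo_disconnect,
};

/*
 * Condition the remove path would wait on. Without the "gone" flag,
 * a surprise removal leaves this false forever.
 */
static bool foo_wait_done(const struct foo_dev *foo)
{
	return foo->irq_seen || foo->gone;
}

/* Mirrors the NULL checks and dispatch added by this patch. */
static int mock_set_disconnected(struct pci_dev *dev)
{
	if (dev->driver && dev->driver->err_handler &&
	    dev->driver->err_handler->disconnect)
		dev->driver->err_handler->disconnect(dev);

	return 0;
}
```

In a real driver the wait would presumably be something like a
wait_event() on the same condition, with the disconnect callback also
waking the waiter. That part is an assumption and is not covered by
this patch.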

 drivers/pci/pci.h   | 9 +++++++++
 include/linux/pci.h | 3 +++
 2 files changed, 12 insertions(+)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index b81e99cd4b62..78b064be10d5 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -549,6 +549,15 @@ static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
 	pci_dev_set_io_state(dev, pci_channel_io_perm_failure);
 	pci_doe_disconnected(dev);
 
+	/* Notify driver of surprise removal */
+	device_lock(&dev->dev);
+
+	if (dev->driver && dev->driver->err_handler &&
+	    dev->driver->err_handler->disconnect)
+		dev->driver->err_handler->disconnect(dev);
+
+	device_unlock(&dev->dev);
+
 	return 0;
 }
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 51e2bd6405cd..30a8c7ee09f6 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -878,6 +878,9 @@ struct pci_error_handlers {
 	/* PCI slot has been reset */
 	pci_ers_result_t (*slot_reset)(struct pci_dev *dev);
 
+	/* PCI slot has been disconnected */
+	void (*disconnect)(struct pci_dev *dev);
+
 	/* PCI function reset prepare or completed */
 	void (*reset_prepare)(struct pci_dev *dev);
 	void (*reset_done)(struct pci_dev *dev);
-- 
MST