[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250717091025-mutt-send-email-mst@kernel.org>
Date: Thu, 17 Jul 2025 11:11:44 -0400
From: "Michael S. Tsirkin" <mst@...hat.com>
To: Lukas Wunner <lukas@...ner.de>
Cc: linux-kernel@...r.kernel.org, Keith Busch <kbusch@...nel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Parav Pandit <parav@...dia.com>, virtualization@...ts.linux.dev,
stefanha@...hat.com, alok.a.tiwari@...cle.com,
linux-pci@...r.kernel.org
Subject: Re: [PATCH RFC v5 1/5] pci: report surprise removal event
On Mon, Jul 14, 2025 at 08:11:04AM +0200, Lukas Wunner wrote:
> On Wed, Jul 09, 2025 at 04:55:26PM -0400, Michael S. Tsirkin wrote:
> > At the moment, in case of a surprise removal, the regular remove
> > callback is invoked, exclusively. This works well, because mostly, the
> > cleanup would be the same.
> >
> > However, there's a race: imagine device removal was initiated by a user
> > action, such as driver unbind, and it in turn initiated some cleanup and
> > is now waiting for an interrupt from the device. If the device is now
> > surprise-removed, that never arrives and the remove callback hangs
> > forever.
>
> For PCI devices in a hotplug slot, user space can initiate "safe removal"
> by writing "0" to the hotplug slot's "power" file in sysfs.
>
> If the PCI device is yanked from the slot while safe removal is ongoing,
> there is likewise no way for the driver to know that the device is
> suddenly gone. That's because pciehp_unconfigure_device() only calls
> pci_dev_set_disconnected() in the surprise removal case, not for
> safe removal.
>
> The solution proposed here is thus not a complete one: It may work
> if user space initiated *driver* removal, but not if it initiated *safe*
> removal of the entire device. For virtio, that may be sufficient.
So just as an idea, something like this can work I guess? I'm yet to
test this - wrote this on the go - and also I'll need to implement for
other hotplug drivers, I need it at least for ACPI additonally.
WDYT?
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index bcc938d4420f..46468a1f0244 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -231,6 +231,15 @@ void pciehp_handle_disable_request(struct controller *ctrl)
void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
{
int present, link_active;
+ /*
+ * Always mark downstream devices disconnected on Presence Detect Change.
+ * Covers device yanked during safe removal.
+ */
+ if (events & PCI_EXP_SLTSTA_PDC) {
+ struct pci_bus *parent = ctrl->pcie->port->subordinate;
+ if (parent)
+ pci_walk_bus(parent, pci_dev_set_disconnected, NULL);
+ }
/*
* If the slot is on and presence or link has changed, turn it off.
Powered by blists - more mailing lists