Message-ID: <20260109095603.1088620-1-duziming2@huawei.com>
Date: Fri, 9 Jan 2026 17:56:03 +0800
From: Ziming Du <duziming2@...wei.com>
To: <bhelgaas@...gle.com>, <okaya@...nel.org>, <keith.busch@...el.com>
CC: <linux-pci@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<liuyongqiang13@...wei.com>, <duziming2@...wei.com>
Subject: [PATCH] PCI: Fix AB-BA deadlock between aer_isr() and device_shutdown()

During system shutdown, a deadlock may occur between the AER recovery
process and device shutdown, as follows:

The device_shutdown() path holds device_lock for the entire process and
waits for the IRQ handlers to complete when releasing nodes:

  device_shutdown
    device_lock                      # A hold device_lock
    pci_device_shutdown
      pcie_port_device_remove
        remove_iter
          device_unregister
            device_del
              bus_remove_device
                device_release_driver
                  devres_release_all
                    release_nodes    # B wait for irq handlers

The aer_isr() path acquires device_lock in pci_bus_reset():

  aer_isr                            # B execute irq process
    aer_isr_one_error
      aer_process_err_devices
        handle_error_source
          pcie_do_recovery
          aer_root_reset
            pci_bus_error_reset
              pci_bus_reset          # A acquire device_lock

This circular dependency hangs the system. Fix it by using
pci_bus_trylock() instead of pci_bus_lock() in pci_bus_reset(). If the
lock cannot be taken, return -EAGAIN, as in similar cases.

Fixes: c4eed62a2143 ("PCI/ERR: Use slot reset if available")
Signed-off-by: Ziming Du <duziming2@...wei.com>
---
 drivers/pci/pci.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 13dbb405dc31..7471bfa6f32e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5515,15 +5515,22 @@ static int pci_bus_reset(struct pci_bus *bus, bool probe)
 	if (probe)
 		return 0;
 
-	pci_bus_lock(bus);
+	/*
+	 * Use pci_bus_trylock() instead of blocking in pci_bus_lock() to avoid
+	 * deadlocking with device_shutdown(); return -EAGAIN if it fails.
+	 */
+	if (pci_bus_trylock(bus)) {
 
-	might_sleep();
+		might_sleep();
 
-	ret = pci_bridge_secondary_bus_reset(bus->self);
+		ret = pci_bridge_secondary_bus_reset(bus->self);
 
-	pci_bus_unlock(bus);
+		pci_bus_unlock(bus);
 
-	return ret;
+		return ret;
+	}
+
+	return -EAGAIN;
 }
 
 /**
-- 
2.43.0

