lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250619154610.902892-1-Raju.Rangoju@amd.com>
Date: Thu, 19 Jun 2025 21:16:10 +0530
From: Raju Rangoju <Raju.Rangoju@....com>
To: <bhelgaas@...gle.com>, <linux-pci@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>
CC: <lukas@...ner.de>, Raju Rangoju <Raju.Rangoju@....com>, Sanath S
	<Sanath.S@....com>
Subject: [PATCH] PCI: pciehp: fix circular lock dependency b/w pci_rescan_remove_lock and reset_lock

Resolves a circular locking dependency issue between
pci_rescan_remove_lock() and ctrl->reset_lock() in the PCI hotplug
subsystem, specifically in the pciehp_unconfigure_device() function.

Commit f5eff5591b8f ("PCI: pciehp: Fix AB-BA deadlock between reset_lock
 and device_lock") introduced a change in the locking order within
pciehp_unconfigure_device() to avoid an AB-BA deadlock between
ctrl->reset_lock and device_lock. However, this change inadvertently
introduced a new circular locking dependency between
pci_rescan_remove_lock and ctrl->reset_lock, as detected by lockdep.

The problematic sequence is as follows:
  1. pciehp_unconfigure_device() acquires ctrl->reset_lock (read lock).
  2. It then acquires pci_rescan_remove_lock.
  3. Within the device removal loop, it attempts to reacquire
     ctrl->reset_lock after releasing it for driver unbinding,
     while still holding pci_rescan_remove_lock.

This creates a potential for deadlock if another thread acquires the
locks in the opposite order, as illustrated by lockdep's report.

To resolve this, change the locking order in pciehp_unconfigure_device()
so that ctrl->reset_lock is released before acquiring
pci_rescan_remove_lock and before driver unbinding. This avoids
holding both locks at the same time and breaks the circular
dependency. After the critical section, ctrl->reset_lock is reacquired
as needed.

This ensures that the locking order is consistent and prevents the
circular dependency, addressing the lockdep warning and potential
deadlock.

[  120.615285] ======================================================
[  120.615289] WARNING: possible circular locking dependency detected
[  120.615293] 6.14.5-300.fc42.x86_64+debug #1 Not tainted
[  120.615297] ------------------------------------------------------
[  120.615300] irq/36-pciehp/136 is trying to acquire lock:
[  120.615303] ffff88810d247340 (&ctrl->reset_lock){.+.+}-{4:4}, at: pciehp_unconfigure_device+0x1d0/0x390
[  120.615323]
               but task is already holding lock:
[  120.615326] ffffffff9c13ce90 (pci_rescan_remove_lock){+.+.}-{4:4}, at: pciehp_unconfigure_device+0xe5/0x390
[  120.615337]
               which lock already depends on the new lock.

[  120.615340]
               the existing dependency chain (in reverse order) is:
[  120.615343]
               -> #1 (pci_rescan_remove_lock){+.+.}-{4:4}:
[  120.615351]        lock_acquire.part.0+0x133/0x390
[  120.615359]        __mutex_lock+0x1b3/0x1490
[  120.615366]        pciehp_unconfigure_device+0xe5/0x390
[  120.615370]        pciehp_disable_slot+0xfa/0x2f0
[  120.615376]        pciehp_handle_presence_or_link_change+0xc8/0x310
[  120.615381]        pciehp_ist+0x2d5/0x3e0
[  120.615386]        irq_thread_fn+0x88/0x160
[  120.615393]        irq_thread+0x21e/0x490
[  120.615399]        kthread+0x39d/0x760
[  120.615406]        ret_from_fork+0x31/0x70
[  120.615412]        ret_from_fork_asm+0x1a/0x30
[  120.615418]
               -> #0 (&ctrl->reset_lock){.+.+}-{4:4}:
[  120.615426]        check_prev_add+0x1ab/0x23b0
[  120.615431]        __lock_acquire+0x2311/0x2e10
[  120.615436]        lock_acquire.part.0+0x133/0x390
[  120.615441]        down_read_nested+0xa4/0x490
[  120.615445]        pciehp_unconfigure_device+0x1d0/0x390
[  120.615448]        pciehp_disable_slot+0xfa/0x2f0
[  120.615453]        pciehp_handle_presence_or_link_change+0xc8/0x310
[  120.615458]        pciehp_ist+0x2d5/0x3e0
[  120.615462]        irq_thread_fn+0x88/0x160
[  120.615465]        irq_thread+0x21e/0x490
[  120.615469]        kthread+0x39d/0x760
[  120.615474]        ret_from_fork+0x31/0x70
[  120.615478]        ret_from_fork_asm+0x1a/0x30
[  120.615483]
               other info that might help us debug this:

[  120.615485]  Possible unsafe locking scenario:

[  120.615488]        CPU0                    CPU1
[  120.615490]        ----                    ----
[  120.615492]   lock(pci_rescan_remove_lock);
[  120.615497]                                lock(&ctrl->reset_lock);
[  120.615502]                                lock(pci_rescan_remove_lock);
[  120.615506]   rlock(&ctrl->reset_lock);
[  120.615511]
                *** DEADLOCK ***

Fixes: f5eff5591b8f ("PCI: pciehp: Fix AB-BA deadlock between reset_lock and device_lock")
Co-developed-by: Sanath S <Sanath.S@....com>
Signed-off-by: Sanath S <Sanath.S@....com>
Signed-off-by: Raju Rangoju <Raju.Rangoju@....com>
---
 drivers/pci/hotplug/pciehp_pci.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index 65e50bee1a8c..08ba59d96f4a 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -104,6 +104,11 @@ void pciehp_unconfigure_device(struct controller *ctrl, bool presence)
 	if (!presence)
 		pci_walk_bus(parent, pci_dev_set_disconnected, NULL);
 
+	/*
+	 * Release reset_lock before driver unbinding
+	 * to avoid AB-BA deadlock with device_lock.
+	 */
+	up_read(&ctrl->reset_lock);
 	pci_lock_rescan_remove();
 
 	/*
@@ -116,11 +121,6 @@ void pciehp_unconfigure_device(struct controller *ctrl, bool presence)
 					 bus_list) {
 		pci_dev_get(dev);
 
-		/*
-		 * Release reset_lock during driver unbinding
-		 * to avoid AB-BA deadlock with device_lock.
-		 */
-		up_read(&ctrl->reset_lock);
 		pci_stop_and_remove_bus_device(dev);
 		down_read_nested(&ctrl->reset_lock, ctrl->depth);
 
@@ -134,8 +134,14 @@ void pciehp_unconfigure_device(struct controller *ctrl, bool presence)
 			command |= PCI_COMMAND_INTX_DISABLE;
 			pci_write_config_word(dev, PCI_COMMAND, command);
 		}
+		/*
+		 * Release reset_lock before driver unbinding
+		 * to avoid AB-BA deadlock with device_lock.
+		 */
+		up_read(&ctrl->reset_lock);
 		pci_dev_put(dev);
 	}
 
+	down_read_nested(&ctrl->reset_lock, ctrl->depth);
 	pci_unlock_rescan_remove();
 }
-- 
2.34.1


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ