[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250619154610.902892-1-Raju.Rangoju@amd.com>
Date: Thu, 19 Jun 2025 21:16:10 +0530
From: Raju Rangoju <Raju.Rangoju@....com>
To: <bhelgaas@...gle.com>, <linux-pci@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
CC: <lukas@...ner.de>, Raju Rangoju <Raju.Rangoju@....com>, Sanath S
<Sanath.S@....com>
Subject: [PATCH] PCI: pciehp: fix circular lock dependency b/w pci_rescan_remove_lock and reset_lock
Resolves a circular locking dependency issue between
pci_rescan_remove_lock() and ctrl->reset_lock() in the PCI hotplug
subsystem, specifically in the pciehp_unconfigure_device() function.
Commit f5eff5591b8f ("PCI: pciehp: Fix AB-BA deadlock between reset_lock
and device_lock") introduced a change in the locking order within
pciehp_unconfigure_device() to avoid an AB-BA deadlock between
ctrl->reset_lock and device_lock. However, this change inadvertently
introduced a new circular locking dependency between
pci_rescan_remove_lock and ctrl->reset_lock, as detected by lockdep.
The problematic sequence is as follows:
1. pciehp_unconfigure_device() acquires ctrl->reset_lock (read lock).
2. It then acquires pci_rescan_remove_lock.
3. Within the device removal loop, it attempts to reacquire
ctrl->reset_lock after releasing it for driver unbinding,
while still holding pci_rescan_remove_lock.
This creates a potential for deadlock if another thread acquires the
locks in the opposite order, as illustrated by lockdep's report.
To resolve this, change the locking order in pciehp_unconfigure_device()
so that ctrl->reset_lock is released before acquiring
pci_rescan_remove_lock and before driver unbinding. This avoids
holding both locks at the same time and breaks the circular
dependency. After the critical section, ctrl->reset_lock is reacquired
as needed.
This ensures that the locking order is consistent and prevents the
circular dependency, addressing the lockdep warning and potential
deadlock.
[ 120.615285] ======================================================
[ 120.615289] WARNING: possible circular locking dependency detected
[ 120.615293] 6.14.5-300.fc42.x86_64+debug #1 Not tainted
[ 120.615297] ------------------------------------------------------
[ 120.615300] irq/36-pciehp/136 is trying to acquire lock:
[ 120.615303] ffff88810d247340 (&ctrl->reset_lock){.+.+}-{4:4}, at: pciehp_unconfigure_device+0x1d0/0x390
[ 120.615323]
but task is already holding lock:
[ 120.615326] ffffffff9c13ce90 (pci_rescan_remove_lock){+.+.}-{4:4}, at: pciehp_unconfigure_device+0xe5/0x390
[ 120.615337]
which lock already depends on the new lock.
[ 120.615340]
the existing dependency chain (in reverse order) is:
[ 120.615343]
-> #1 (pci_rescan_remove_lock){+.+.}-{4:4}:
[ 120.615351] lock_acquire.part.0+0x133/0x390
[ 120.615359] __mutex_lock+0x1b3/0x1490
[ 120.615366] pciehp_unconfigure_device+0xe5/0x390
[ 120.615370] pciehp_disable_slot+0xfa/0x2f0
[ 120.615376] pciehp_handle_presence_or_link_change+0xc8/0x310
[ 120.615381] pciehp_ist+0x2d5/0x3e0
[ 120.615386] irq_thread_fn+0x88/0x160
[ 120.615393] irq_thread+0x21e/0x490
[ 120.615399] kthread+0x39d/0x760
[ 120.615406] ret_from_fork+0x31/0x70
[ 120.615412] ret_from_fork_asm+0x1a/0x30
[ 120.615418]
-> #0 (&ctrl->reset_lock){.+.+}-{4:4}:
[ 120.615426] check_prev_add+0x1ab/0x23b0
[ 120.615431] __lock_acquire+0x2311/0x2e10
[ 120.615436] lock_acquire.part.0+0x133/0x390
[ 120.615441] down_read_nested+0xa4/0x490
[ 120.615445] pciehp_unconfigure_device+0x1d0/0x390
[ 120.615448] pciehp_disable_slot+0xfa/0x2f0
[ 120.615453] pciehp_handle_presence_or_link_change+0xc8/0x310
[ 120.615458] pciehp_ist+0x2d5/0x3e0
[ 120.615462] irq_thread_fn+0x88/0x160
[ 120.615465] irq_thread+0x21e/0x490
[ 120.615469] kthread+0x39d/0x760
[ 120.615474] ret_from_fork+0x31/0x70
[ 120.615478] ret_from_fork_asm+0x1a/0x30
[ 120.615483]
other info that might help us debug this:
[ 120.615485] Possible unsafe locking scenario:
[ 120.615488] CPU0 CPU1
[ 120.615490] ---- ----
[ 120.615492] lock(pci_rescan_remove_lock);
[ 120.615497] lock(&ctrl->reset_lock);
[ 120.615502] lock(pci_rescan_remove_lock);
[ 120.615506] rlock(&ctrl->reset_lock);
[ 120.615511]
*** DEADLOCK ***
Fixes: f5eff5591b8f ("PCI: pciehp: Fix AB-BA deadlock between reset_lock and device_lock")
Co-developed-by: Sanath S <Sanath.S@....com>
Signed-off-by: Sanath S <Sanath.S@....com>
Signed-off-by: Raju Rangoju <Raju.Rangoju@....com>
---
drivers/pci/hotplug/pciehp_pci.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index 65e50bee1a8c..08ba59d96f4a 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -104,6 +104,11 @@ void pciehp_unconfigure_device(struct controller *ctrl, bool presence)
if (!presence)
pci_walk_bus(parent, pci_dev_set_disconnected, NULL);
+ /*
+ * Release reset_lock before driver unbinding
+ * to avoid AB-BA deadlock with device_lock.
+ */
+ up_read(&ctrl->reset_lock);
pci_lock_rescan_remove();
/*
@@ -116,11 +121,6 @@ void pciehp_unconfigure_device(struct controller *ctrl, bool presence)
bus_list) {
pci_dev_get(dev);
- /*
- * Release reset_lock during driver unbinding
- * to avoid AB-BA deadlock with device_lock.
- */
- up_read(&ctrl->reset_lock);
pci_stop_and_remove_bus_device(dev);
down_read_nested(&ctrl->reset_lock, ctrl->depth);
@@ -134,8 +134,14 @@ void pciehp_unconfigure_device(struct controller *ctrl, bool presence)
command |= PCI_COMMAND_INTX_DISABLE;
pci_write_config_word(dev, PCI_COMMAND, command);
}
+ /*
+ * Release reset_lock before driver unbinding
+ * to avoid AB-BA deadlock with device_lock.
+ */
+ up_read(&ctrl->reset_lock);
pci_dev_put(dev);
}
+ down_read_nested(&ctrl->reset_lock, ctrl->depth);
pci_unlock_rescan_remove();
}
--
2.34.1
Powered by blists - more mailing lists