[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250708222530.1041477-1-himanshu.madhani@oracle.com>
Date: Tue, 8 Jul 2025 22:25:30 +0000
From: himanshu.madhani@...cle.com
To: linux-kernel@...r.kernel.org
Cc: tglx@...utronix.de, Himanshu Madhani <himanshu.madhani@...cle.com>,
Jorge Lopez <jorge.jo.lopez@...cle.com>
Subject: [PATCH] PCI/THP: Fix hang due to incorrect guard lock
From: Himanshu Madhani <himanshu.madhani@...cle.com>
Fix system hang due to incorrect mutex lock placement.
following stack trace will be seen on system boot
[ 525.664681] task:systemd-shutdow state:D stack:0 pid:1 tgid:1 ppid:0 task_flags:0x400100 flags:0x00004002
[ 525.796878] Call Trace:
[ 525.826116] <TASK>
[ 525.851195] __schedule+0x2d1/0x730
[ 525.892917] schedule+0x27/0x80
[ 525.930478] schedule_preempt_disabled+0x15/0x30
[ 525.985718] __mutex_lock.constprop.0+0x4be/0x8a0
[ 526.041993] msi_domain_get_virq+0xcc/0x110
[ 526.092031] pci_msix_write_tph_tag+0x3c/0x100
[ 526.145186] pcie_tph_set_st_entry+0x125/0x1d0
[ 526.198346] bnxt_irq_affinity_release+0x35/0x50 [bnxt_en]
[ 526.264015] irq_set_affinity_notifier+0xe0/0x130
[ 526.320291] bnxt_free_irq+0x6e/0x110 [bnxt_en]
[ 526.374507] __bnxt_close_nic.isra.0+0x1eb/0x220 [bnxt_en]
[ 526.440175] bnxt_close+0x3a/0x100 [bnxt_en]
[ 526.491264] __dev_close_many+0xae/0x220
[ 526.538179] dev_close_many+0xc2/0x1b0
[ 526.583014] netif_close+0x9d/0xd0
[ 526.623693] bnxt_shutdown+0xb1/0xe0 [bnxt_en]
[ 526.676874] pci_device_shutdown+0x35/0x70
[ 526.725871] device_shutdown+0x118/0x1a0
[ 526.772788] kernel_restart+0x3a/0x70
[ 526.816588] __do_sys_reboot+0x150/0x250
[ 526.863504] do_syscall_64+0x84/0x940
[ 526.907300] ? __put_user_8+0xd/0x20
[ 526.950059] ? rseq_ip_fixup+0x90/0x1e0
[ 526.995937] ? task_mm_cid_work+0x1ad/0x220
[ 527.045971] ? __rseq_handle_notify_resume+0x35/0x90
[ 527.105367] ? arch_exit_to_user_mode_prepare.isra.0+0x98/0xb0
[ 527.175166] ? do_syscall_64+0xba/0x940
[ 527.221040] ? do_filp_open+0xd7/0x1a0
[ 527.265882] ? alloc_fd+0xba/0x110
[ 527.306556] ? do_sys_openat2+0xa4/0xf0
[ 527.352434] ? __x64_sys_openat+0x54/0xb0
[ 527.400389] ? arch_exit_to_user_mode_prepare.isra.0+0x9/0xb0
[ 527.469150] ? do_syscall_64+0xba/0x940
[ 527.515023] ? do_user_addr_fault+0x221/0x690
[ 527.567141] ? clear_bhb_loop+0x30/0x80
[ 527.613017] ? clear_bhb_loop+0x30/0x80
[ 527.658895] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 527.719332] RIP: 0033:0x7fc3ec504777
[ 527.762091] RSP: 002b:00007ffecd62c4f8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a9
[ 527.852685] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc3ec504777
[ 527.938085] RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead
[ 528.023485] RBP: 00007ffecd62c700 R08: 0000000000000000 R09: 00007ffecd62b8e0
[ 528.108878] R10: 0000000000000001 R11: 0000000000000202 R12: 00007ffecd62c568
[ 528.194273] R13: 00007ffecd62c548 R14: 00007ffecd62c568 R15: 0000000000000000
[ 528.279672] </TASK>
Fixes: d5124a9957b2 ("PCI/MSI: Provide a sane mechanism for TPH")
Fixes: 71296eae5887 ("PCI/TPH: Replace the broken MSI-X control word update")
Reported-by: Jorge Lopez <jorge.jo.lopez@...cle.com>
Suggested-by: Thomas Gleixner <tglx@...utronix.de>
Tested-by: Jorge Lopez <jorge.jo.lopez@...cle.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@...cle.com>
---
drivers/pci/msi/msi.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
index 6ede55a7c5e6..d686488f4111 100644
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -934,10 +934,12 @@ int pci_msix_write_tph_tag(struct pci_dev *pdev, unsigned int index, u16 tag)
if (!pdev->msix_enabled)
return -ENXIO;
- guard(msi_descs_lock)(&pdev->dev);
virq = msi_get_virq(&pdev->dev, index);
if (!virq)
return -ENXIO;
+
+ guard(msi_descs_lock)(&pdev->dev);
+
/*
* This is a horrible hack, but short of implementing a PCI
* specific interrupt chip callback and a huge pile of
--
2.41.0.rc2
Powered by blists - more mailing lists