Message-ID: <20240514135303.176134-2-krishnak@linux.ibm.com>
Date: Tue, 14 May 2024 19:22:58 +0530
From: Krishna Kumar <krishnak@...ux.ibm.com>
To: linuxppc-dev@...ts.ozlabs.org
Cc: linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
mahesh@...ux.ibm.com, Krishna Kumar <krishnak@...ux.ibm.com>,
Michael Ellerman <mpe@...erman.id.au>,
Nicholas Piggin <npiggin@...il.com>,
Christophe Leroy <christophe.leroy@...roup.eu>,
"Aneesh Kumar K.V" <aneesh.kumar@...nel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Gaurav Batra <gbatra@...ux.ibm.com>,
Nathan Lynch <nathanl@...ux.ibm.com>,
Brian King <brking@...ux.vnet.ibm.com>
Subject: [PATCH v2 1/2] pci/hotplug/pnv_php: Fix hotplug driver crash on Powernv
Description of the problem: The hotplug driver for powerpc
(pci/hotplug/pnv_php.c) causes a kernel crash when we try to
hot-unplug/disable the PCIe switch/bridge from the PHB.
Root Cause of Crash: The MSI data structure is released and set to NULL
during the disable/hot-unplug path. During unregistration, the code
still tries to explicitly disable MSI, which causes a NULL pointer
dereference and a kernel crash.
Proposed Fix: Correct the check in the unregistration path so that the
code does not invoke pci_disable_msi/msix() if the MSI data structure
has already been freed.
Cc: Michael Ellerman <mpe@...erman.id.au>
Cc: Nicholas Piggin <npiggin@...il.com>
Cc: Christophe Leroy <christophe.leroy@...roup.eu>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@...nel.org>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>
Cc: Gaurav Batra <gbatra@...ux.ibm.com>
Cc: Nathan Lynch <nathanl@...ux.ibm.com>
Cc: Brian King <brking@...ux.vnet.ibm.com>
Signed-off-by: Krishna Kumar <krishnak@...ux.ibm.com>
---
Command used for reproducing the bug:
echo 0 > /sys/bus/pci/slots/C5/power
where C5 is the slot associated with the bridge.
Snippet of Crash:
Kernel attempted to read user page (10) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x00000010
Faulting instruction address: 0xc000000000fad7d4
Oops: Kernel access of bad area, sig: 11 [#1]
Hardware name: 5105-22E POWER9 0x4e1203 opal:v7.0-39-g4660e63a PowerNV
NIP [c000000000fad7d4] mutex_lock+0x34/0x88
LR [c000000000fad7c8] mutex_lock+0x28/0x88
Call Trace:
[c00000017075f940] [c000000000fad7c8] mutex_lock+0x28/0x88 (unreliable)
[c00000017075f970] [c000000000214464] msi_lock_descs+0x28/0x3c
[c00000017075f990] [c0000000008e8be8] pci_disable_msi+0x68/0xa8
[c00000017075f9c0] [c00000000090f0a4] pnv_php_disable_irq+0x2a0/0x2b0
[c00000017075fab0] [c00000000090f128] pnv_php_free_slot+0x74/0xc4
[c00000017075fb30] [c000000000912184] pnv_php_disable_slot.part.0+0x1b8/0x24c
[c00000017075fc00] [c000000000902df0] power_write_file+0xf8/0x18c
[c00000017075fc80] [c0000000008f84d8] pci_slot_attr_store+0x40/0x5c
[c00000017075fca0] [c0000000006b4834] sysfs_kf_write+0x64/0x78
[c00000017075fcc0] [c0000000006b3300] kernfs_fop_write_iter+0x1b8/0x2dc
[c00000017075fd10] [c0000000005b3eb0] vfs_write+0x224/0x4e8
[c00000017075fdc0] [c0000000005b44b0] ksys_write+0x88/0x150
[c00000017075fe10] [c000000000030864] system_call_exception+0x124/0x320
[c00000017075fe50] [c00000000000cedc] system_call_vectored_common+0x15c/0x2ec
--- interrupt: 3000 at 0x7fffb9748774
Root-Cause: It is safe to call pci_disable_msi() as long as its
associated data structures have not been freed (as in the bailout path).
But when the driver disables the bridge during a hot-unplug operation,
its MSI data structure becomes NULL
(php_slot->pdev->dev.msi.data:0000000000000000). This happens in
msi_device_data_release(), during the disable path and before
unregistration. In this case it is not safe to explicitly call
pci_disable_msi/msix(), because doing so dereferences a NULL pointer.
The current code makes that call anyway, and the crash happens at the
line mutex_lock(&dev->msi.data->mutex).
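The faulting address 0x00000010 in the oops is consistent with taking
the address of a member of a NULL structure. The sketch below is a
user-space model, not kernel code: struct msi_data_model and
struct mutex_model are hypothetical stand-ins, and the two fields placed
before the mutex are an assumption about the layout of
struct msi_device_data, not something stated in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mutex; only its first word matters here. */
struct mutex_model {
	long owner;
};

/*
 * Hypothetical stand-in for struct msi_device_data. The two leading
 * members are an ASSUMPTION about the kernel layout.
 */
struct msi_data_model {
	unsigned long properties;
	void *platform_data;
	struct mutex_model mutex;
};

/*
 * With data == NULL, &data->mutex evaluates to the offset of the mutex
 * member, so that is the address the lock operation would touch.
 */
size_t mutex_addr_for_null_data(void)
{
	return offsetof(struct msi_data_model, mutex);
}

/* The offset is the size of the two members that precede the mutex. */
void check_fault_address_model(void)
{
	assert(mutex_addr_for_null_data() ==
	       sizeof(unsigned long) + sizeof(void *));
}
```

On a 64-bit (LP64) build both leading members are 8 bytes, so under this
assumed layout the address is 0x10, matching the oops.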
FIX: In the current code, there are two paths that invoke
pci_disable_msi/msix(). In the error/bailout path, the first argument of
the check if (disable_device || irq > 0), i.e. disable_device, is true,
so pci_disable_msi/msix() is always invoked regardless of the second
argument. In this path it is fine to call pci_disable_msi/msix().
During the slot release/disable/hot-unplug path, disable_device is
false, but irq still holds the old value, which makes the overall check
true and causes the crash. The check should not use the old/stale value
of irq; if anything, it should use php_slot->irq. However, since
php_slot->irq is always 0 by the time the check runs, including it makes
no difference. So the check can be reduced to a single argument,
disable_device, and pci_disable_msi/msix() is invoked based on its value
alone. This is the fix for the crash. During the bailout path its value
is true, and during the hot-unplug operation on the bridge slot its
value is false.
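The difference between the old and new checks can be illustrated with a
small user-space model. This is only a sketch: struct slot_model and the
function names are hypothetical, and setting irq to 0 stands in for the
driver releasing the interrupt and clearing php_slot->irq before the
check is reached.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one pnv_php_slot field the check uses. */
struct slot_model {
	int irq;	/* mirrors php_slot->irq */
};

/* Old check: consults a copy of irq taken at function entry. */
bool old_check_disables_msi(struct slot_model *slot, bool disable_device)
{
	int irq = slot->irq;	/* snapshot; goes stale once irq is cleared */

	if (slot->irq > 0)
		slot->irq = 0;	/* models releasing the IRQ and clearing it */

	return disable_device || irq > 0;
}

/* New check: depends only on disable_device. */
bool new_check_disables_msi(struct slot_model *slot, bool disable_device)
{
	if (slot->irq > 0)
		slot->irq = 0;

	return disable_device;
}

/* Walks through the hot-unplug and bailout cases described above. */
void check_models(void)
{
	struct slot_model s = { .irq = 42 };

	/* Hot-unplug path: the stale copy makes the old check true. */
	assert(old_check_disables_msi(&s, false));

	s.irq = 42;
	/* Hot-unplug path: the new check skips the MSI disable. */
	assert(!new_check_disables_msi(&s, false));

	s.irq = 42;
	/* Bailout path: both checks still disable MSI. */
	assert(old_check_disables_msi(&s, true));
	s.irq = 42;
	assert(new_check_disables_msi(&s, true));
}
```

In the model, only the hot-unplug case (disable_device false, irq
previously set) changes behaviour, which is the case that crashed.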
drivers/pci/hotplug/pnv_php.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index 694349be9d0a..573a41869c15 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -40,7 +40,6 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
bool disable_device)
{
struct pci_dev *pdev = php_slot->pdev;
- int irq = php_slot->irq;
u16 ctrl;
if (php_slot->irq > 0) {
@@ -59,7 +58,7 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
php_slot->wq = NULL;
}
- if (disable_device || irq > 0) {
+ if (disable_device) {
if (pdev->msix_enabled)
pci_disable_msix(pdev);
else if (pdev->msi_enabled)
--
2.45.0