[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <a255fcb3a3fdebcd90f84e08b555f1786eb8eba2.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com>
Date: Mon, 23 Mar 2020 17:25:58 -0700
From: sathyanarayanan.kuppuswamy@...ux.intel.com
To: bhelgaas@...gle.com
Cc: linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
ashok.raj@...el.com, sathyanarayanan.kuppuswamy@...ux.intel.com
Subject: [PATCH v18 01/11] PCI/ERR: Update error status after reset_link()
From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
Commit bdb5ac85777d ("PCI/ERR: Handle fatal error recovery") uses
reset_link() to recover from fatal errors. But during fatal error recovery,
if the initial value of error status is PCI_ERS_RESULT_DISCONNECT or
PCI_ERS_RESULT_NO_AER_DRIVER then even after successful recovery (using
reset_link()) pcie_do_recovery() will report the recovery result as
failure. So update the status of error after reset_link().
You can reproduce this issue by triggering a SW DPC using "DPC
Software Trigger" bit in "DPC Control Register". You should see recovery
failed dmesg log as below.
pcieport 0000:00:16.0: DPC: containment event, status:0x1f27 source:0x0000
pcieport 0000:00:16.0: DPC: software trigger detected
pci 0000:04:00.0: AER: can't recover (no error_detected callback)
pcieport 0000:00:16.0: AER: device recovery failed
Fixes: bdb5ac85777d ("PCI/ERR: Handle fatal error recovery")
Link: https://lore.kernel.org/r/15e702a33cc27314f9d43a06ccb408086a229cef.1583286655.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@...gle.com>
Acked-by: Keith Busch <keith.busch@...el.com>
Cc: Ashok Raj <ashok.raj@...el.com>
---
drivers/pci/pcie/err.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 01dfc8bb7ca0..1ac57e9e1e71 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -203,14 +203,14 @@ void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
bus = dev->subordinate;
pci_dbg(dev, "broadcast error_detected message\n");
- if (state == pci_channel_io_frozen)
+ if (state == pci_channel_io_frozen) {
pci_walk_bus(bus, report_frozen_detected, &status);
- else
+ status = reset_link(dev, service);
+ if (status != PCI_ERS_RESULT_RECOVERED)
+ goto failed;
+ } else {
pci_walk_bus(bus, report_normal_detected, &status);
-
- if (state == pci_channel_io_frozen &&
- reset_link(dev, service) != PCI_ERS_RESULT_RECOVERED)
- goto failed;
+ }
if (status == PCI_ERS_RESULT_CAN_RECOVER) {
status = PCI_ERS_RESULT_RECOVERED;
--
2.17.1
Powered by blists - more mailing lists