[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Ys2CeIqv//5ZGJTM@kbusch-mbp>
Date: Tue, 12 Jul 2022 08:17:28 -0600
From: Keith Busch <kbusch@...nel.org>
To: Niklas Schnelle <schnelle@...ux.ibm.com>
Cc: Christoph Hellwig <hch@....de>, Stefan Roese <sr@...x.de>,
Matthew Rosato <mjrosato@...ux.ibm.com>,
linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] nvme-pci: fix hang during error recovery when the
PCI device is isolated
On Tue, Jul 12, 2022 at 02:44:53PM +0200, Niklas Schnelle wrote:
> On s390 and powerpc PCI devices are isolated when an error is detected
> and driver->err_handler->error_detected is called with an inaccessible
> PCI device and PCI channel state set to pci_channel_io_frozen
> (see Step 1 in Documentation/PCI/pci-error-recovery.rst).
>
> In the case of NVMe devices nvme_error_detected() then calls
> nvme_dev_disable(dev, false) and requests a reset. After a successful
> reset the device is accessible again and nvme_slot_reset() resets the
> controller and queues nvme_reset_work() which then recovers the
> controller.
>
> Since commit b98235d3a471 ("nvme-pci: harden drive presence detect in
> nvme_dev_disable()") however nvme_dev_disable() no longer freezes the
> queues if pci_device_is_present() returns false. This is the case for an
> isolated PCI device. In principle this makes sense as there are no
> accessible hardware queues to run. The problem though is that for
> a previously live reset controller with online queues nvme_reset_work()
> calls nvme_wait_freeze() which, without the freeze having been
> initiated, then hangs forever. Fix this by starting the freeze in
> nvme_slot_reset() which is the earliest point where we know the device
> should be accessible again.
>
> Fixes: b98235d3a471 ("nvme-pci: harden drive presence detect in nvme_dev_disable()")
> Signed-off-by: Niklas Schnelle <schnelle@...ux.ibm.com>
Oh, we've messed up the expected sequence. The mistaken assumption is a device
not present means we're about to unbind from it, but it could appear that way
just for normal error handling and reset, so we need to preserve the previous
handling.
The offending commit really just wants to avoid the register access (which we
shouldn't have to do, but hey, broken hardware...). So let's keep the sequence
the same as before and just skip the register read. Does this work for you?
---
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index fdfee3e590db..c40e82cee735 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
static void nvme_dev_remove_admin(struct nvme_dev *dev)
@@ -2690,9 +2772,11 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
struct pci_dev *pdev = to_pci_dev(dev->dev);
mutex_lock(&dev->shutdown_lock);
- if (pci_device_is_present(pdev) && pci_is_enabled(pdev)) {
- u32 csts = readl(dev->bar + NVME_REG_CSTS);
+ if (pci_is_enabled(pdev)) {
+ u32 csts = ~0;
+ if (pci_device_is_present(pdev))
+ csts = readl(dev->bar + NVME_REG_CSTS);
if (dev->ctrl.state == NVME_CTRL_LIVE ||
dev->ctrl.state == NVME_CTRL_RESETTING) {
freeze = true;
--
Powered by blists - more mailing lists