linux-kernel - Re: [PATCH 1/1] nvme-pci: fix hang during error recovery when the PCI device is isolated

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <f9a57983-0fda-9f94-6d21-e62870ca45ae@denx.de>
Date:   Mon, 18 Jul 2022 11:30:54 +0200
From:   Stefan Roese <sr@...x.de>
To:     Keith Busch <kbusch@...nel.org>,
        Niklas Schnelle <schnelle@...ux.ibm.com>
Cc:     Christoph Hellwig <hch@....de>,
        Matthew Rosato <mjrosato@...ux.ibm.com>,
        linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] nvme-pci: fix hang during error recovery when the PCI
 device is isolated

On 12.07.22 16:17, Keith Busch wrote:
> On Tue, Jul 12, 2022 at 02:44:53PM +0200, Niklas Schnelle wrote:
>> On s390 and powerpc PCI devices are isolated when an error is detected
>> and driver->err_handler->error_detected is called with an inaccessible
>> PCI device and PCI channel state set to pci_channel_io_frozen
>> (see Step 1 in Documentation/PCI/pci-error-recovery.rst).
>>
>> In the case of NVMe devices nvme_error_detected() then calls
>> nvme_dev_disable(dev, false) and requests a reset. After a successful
>> reset the device is accessible again and nvme_slot_reset() resets the
>> controller and queues nvme_reset_work() which then recovers the
>> controller.
>>
>> Since commit b98235d3a471 ("nvme-pci: harden drive presence detect in
>> nvme_dev_disable()") however nvme_dev_disable() no longer freezes the
>> queues if pci_device_is_present() returns false. This is the case for an
>> isolated PCI device. In principle this makes sense as there are no
>> accessible hardware queues to run. The problem though is that for
>> a previously live reset controller with online queues nvme_reset_work()
>> calls nvme_wait_freeze() which, without the freeze having been
>> initiated, then hangs forever. Fix this by starting the freeze in
>> nvme_slot_reset() which is the earliest point where we know the device
>> should be accessible again.
>>
>> Fixes: b98235d3a471 ("nvme-pci: harden drive presence detect in nvme_dev_disable()")
>> Signed-off-by: Niklas Schnelle <schnelle@...ux.ibm.com>
> 
> Oh, we've messed up the expected sequence. The mistaken assumption is a device
> not present means we're about to unbind from it, but it could appear that way
> just for normal error handling and reset, so we need to preserve the previous
> handling.
> 
> The offending commit really just wants to avoid the register access (which we
> shouldn't have to do, but hey, broken hardware...).

Correct.

> So let's keep the sequence
> the same as before and just skip the register read. Does this work for you?
>
> ---
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index fdfee3e590db..c40e82cee735 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
>   static void nvme_dev_remove_admin(struct nvme_dev *dev)
> @@ -2690,9 +2772,11 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
>   	struct pci_dev *pdev = to_pci_dev(dev->dev);
>   
>   	mutex_lock(&dev->shutdown_lock);
> -	if (pci_device_is_present(pdev) && pci_is_enabled(pdev)) {
> -		u32 csts = readl(dev->bar + NVME_REG_CSTS);
> +	if (pci_is_enabled(pdev)) {
> +		u32 csts = ~0;
>   
> +		if (pci_device_is_present(pdev))
> +			csts = readl(dev->bar + NVME_REG_CSTS);
>   		if (dev->ctrl.state == NVME_CTRL_LIVE ||
>   		    dev->ctrl.state == NVME_CTRL_RESETTING) {
>   			freeze = true;
> --


Thanks. Looks good to me. So if anyone whats to send a proper patch with
this change, feel free to add my:

Reviewed-by: Stefan Roese <sr@...x.de>

Thanks,
Stefan