Message-ID: <e1f2ac49-25f4-4b2c-b67c-10782b4e3455@suse.de>
Date: Mon, 14 Apr 2025 13:09:50 +0200
From: Hannes Reinecke <hare@...e.de>
To: "Aithal, Srikanth" <sraithal@....com>, hare@...nel.org
Cc: sagi@...mberg.me, hch@....de, kbusch@...nel.org, Ankit.Soni@....com,
Vasant Hegde <vasant.hegde@....com>, open list
<linux-kernel@...r.kernel.org>,
Linux-Next Mailing List <linux-next@...r.kernel.org>
Subject: Re: Patch "nvme: re-read ANA log page after ns scan completes"
causing regression
On 4/14/25 12:53, Aithal, Srikanth wrote:
> Hello,
>
> With the patch below in today's linux-next (next-20250414) and v6.15-rc2
> we are seeing host boot issues. The host with an NVMe disk hangs on boot.
>
> If we revert this patch or disable CONFIG_NVME_MULTIPATH then host boots
> fine.
>
> commit 62baf70c327444338c34703c71aa8cc8e4189bd6
> Author: Hannes Reinecke <hare@...nel.org>
> Date: Thu Apr 3 09:19:30 2025 +0200
>
> nvme: re-read ANA log page after ns scan completes
>
> When scanning for new namespaces we might have missed an ANA AEN.
>
> The NVMe base spec (NVMe Base Specification v2.1, Figure 151
> 'Asynchronous Event Information - Notice': Asymmetric Namespace Access
> Change) states:
>
> A controller shall not send this event if an Attached Namespace
> Attribute Changed asynchronous event [...] is sent for the same
> event.
>
> so we need to re-read the ANA log page after we rescanned the namespace
> list to update the ANA states of the new namespaces.
>
> Signed-off-by: Hannes Reinecke <hare@...nel.org>
> Reviewed-by: Keith Busch <kbusch@...nel.org>
> Signed-off-by: Christoph Hellwig <hch@....de>
>
>
> Host console starts dumping a lot of errors and log size is more than
> 100 MB. So I am not posting all logs here. I am pasting part of the logs
> here:
> ...
> ...
> [ 49.361223] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x1010
> [ 49.434564] nvme0n1: I/O Cmd(0x2) @ LBA 0, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
> [ 49.443123] I/O error, dev nvme0n1, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [ 49.457080] nvme nvme0: Failed to get ANA log: -4
> [ 49.506511] nvme nvme0: D3 entry latency set to 8 seconds
> [ 49.536300] nvme nvme0: 32/0/0 default/read/poll queues
> [ 49.605281] nvme 0000:41:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0018 address=0x0 flags=0x0000]
> [ 80.081190] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x1010
> [ 80.154109] nvme0n1: I/O Cmd(0x2) @ LBA 128, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
> [ 80.162864] I/O error, dev nvme0n1, sector 128 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [ 80.177032] nvme nvme0: Failed to get ANA log: -4
> [ 80.225460] nvme nvme0: D3 entry latency set to 8 seconds
> [ 80.255395] nvme nvme0: 32/0/0 default/read/poll queues
> [ 80.301278] nvme 0000:41:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0018 address=0x0 flags=0x0000]
> [ 110.789207] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x1010
> [ 110.861990] nvme0n1: I/O Cmd(0x2) @ LBA 2048, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
> [ 110.870842] I/O error, dev nvme0n1, sector 2048 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> [ 110.885040] nvme nvme0: Failed to get ANA log: -4
> [ 110.933460] nvme nvme0: D3 entry latency set to 8 seconds
> [ 110.963447] nvme nvme0: 32/0/0 default/read/poll queues
> [ 111.009276] nvme 0000:41:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0018 address=0x0 flags=0x0000]
> ...
> ...
>
>
Can you try this?
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 78963cab1f74..425c00b02f3e 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4455,7 +4455,7 @@ static void nvme_scan_work(struct work_struct *work)
 	if (test_bit(NVME_AER_NOTICE_NS_CHANGED, &ctrl->events))
 		nvme_queue_scan(ctrl);
 #if CONFIG_NVME_MULTIPATH
-	else
+	else if (ctrl->ana_log_buf)
 		/* Re-read the ANA log page to not miss updates */
 		queue_work(nvme_wq, &ctrl->ana_work);
 #endif
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@...e.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
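
The decision made by the patched branch can be modelled in a few lines of userspace C. This is only a sketch under assumptions, not kernel code: `struct model_ctrl`, `enum followup` and `pick_followup()` are hypothetical names introduced for illustration, and the reading that `ana_log_buf` is NULL on controllers without an ANA log buffer is an inference from the guard in the patch, not something stated in the thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two pieces of controller state the branch reads. */
struct model_ctrl {
	bool ns_changed_pending;	/* models NVME_AER_NOTICE_NS_CHANGED set in ctrl->events */
	void *ana_log_buf;		/* assumed NULL when no ANA log buffer exists */
};

/* Hypothetical labels for what the scan work does once the scan is finished. */
enum followup {
	FOLLOWUP_NONE,		/* nothing further is queued */
	FOLLOWUP_RESCAN,	/* models nvme_queue_scan(ctrl) */
	FOLLOWUP_ANA_WORK,	/* models queue_work(nvme_wq, &ctrl->ana_work) */
};

/* Mirrors the patched if / else-if chain with multipath enabled. */
static enum followup pick_followup(const struct model_ctrl *ctrl)
{
	if (ctrl->ns_changed_pending)
		return FOLLOWUP_RESCAN;
	if (ctrl->ana_log_buf)
		return FOLLOWUP_ANA_WORK;
	return FOLLOWUP_NONE;
}
```

With the plain `else` from the original commit, the last case would also have returned `FOLLOWUP_ANA_WORK`, which in this model corresponds to queueing the ANA work for a controller that has no ANA log buffer to read into.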