Message-ID: <8574c297-fc02-40d6-ba67-ab43e3d5e394@redhat.com>
Date: Fri, 9 Jan 2026 14:18:13 -0500
From: John Meneghini <jmeneghi@...hat.com>
To: Daniel Wagner <wagi@...nel.org>, Keith Busch <kbusch@...nel.org>,
Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>,
Sagi Grimberg <sagi@...mberg.me>, James Smart <james.smart@...adcom.com>,
Hannes Reinecke <hare@...e.de>,
Shinichiro Kawasaki <shinichiro.kawasaki@....com>,
Nilay Shroff <nilay@...ux.ibm.com>, Wen Xiong <wenxiong@...ux.ibm.com>,
Narayana Murty N <nnmlinux@...ux.ibm.com>
Cc: linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org,
Ewan Milne <emilne@...hat.com>, Maurizio Lombardi <mlombard@...hat.com>
Subject: Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state
Unfortunately, this patch has been found to cause a serious regression on powerpc platforms.
If anyone has a powerpc platform with an NVMe/PCIe device installed, please run this simple test and report whether the controller returns to the live state.
# uname -av
Linux rdma-cert-03-lp10.rdma.lab.eng.rdu2.redhat.com 6.19.0-rc4+ #1 SMP Wed Jan 7 21:42:54 EST 2026 ppc64le GNU/Linux
# nvme list-subsys /dev/nvme0n1
nvme-subsys0 - NQN=nqn.1994-11.com.samsung:nvme:PM1735:HHHL:S4WANA0R400032
hostnqn=nqn.2014-08.org.nvmexpress:uuid:1654a627-93b6-4650-ba90-f4dc7a2fd3ee
iopolicy=numa
\
+- nvme0 pcie 0018:01:00.0 live optimized
# nvme subsystem-reset /dev/nvme0; nvme list-subsys /dev/nvme0n1; sleep 1; nvme list-subsys /dev/nvme0n1; nvme list-subsys /dev/nvme0n1;
nvme-subsys0 - NQN=nqn.1994-11.com.samsung:nvme:PM1735:HHHL:S4WANA0R400032
hostnqn=nqn.2014-08.org.nvmexpress:uuid:1654a627-93b6-4650-ba90-f4dc7a2fd3ee
iopolicy=numa
\
+- nvme0 pcie 0018:01:00.0 resetting optimized
[Wed Jan 7 21:59:51 2026] block nvme0n1: no usable path - requeuing I/O
[Wed Jan 7 21:59:51 2026] block nvme0n1: no usable path - requeuing I/O
[Wed Jan 7 21:59:51 2026] block nvme0n1: no usable path - requeuing I/O
[Wed Jan 7 21:59:51 2026] block nvme0n1: no usable path - requeuing I/O
[Wed Jan 7 21:59:51 2026] block nvme0n1: no usable path - requeuing I/O
# nvme list-subsys /dev/nvme0n1;
# nvme list-subsys /dev/nvme0n1;
nvme-subsys0 - NQN=nqn.1994-11.com.samsung:nvme:PM1735:HHHL:S4WANA0R400032
hostnqn=nqn.2014-08.org.nvmexpress:uuid:1654a627-93b6-4650-ba90-f4dc7a2fd3ee
iopolicy=numa
\
+- nvme0 pcie 0018:01:00.0 resetting optimized
At this point the machine is HUNG. The controller never leaves the resetting state.
Because /dev/nvme0n1 is the root device, I need to power-cycle/reboot the host to recover.
/John
On 2/14/25 3:02 AM, Daniel Wagner wrote:
> The fabric transports and also the PCI transport are not entering the
> LIVE state from NEW or RESETTING. This makes the state machine more
> restrictive and allows to catch not supported state transitions, e.g.
> directly switching from RESETTING to LIVE.
>
> Signed-off-by: Daniel Wagner <wagi@...nel.org>
> ---
> drivers/nvme/host/core.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 818d4e49aab51c388af9a48bf9d466fea9cef51b..f028913e2e622ee348e88879c6e6b7e8f8a1cc82 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -564,8 +564,6 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
> switch (new_state) {
> case NVME_CTRL_LIVE:
> switch (old_state) {
> - case NVME_CTRL_NEW:
> - case NVME_CTRL_RESETTING:
> case NVME_CTRL_CONNECTING:
> changed = true;
> fallthrough;
>
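For anyone reading along, here is a minimal standalone sketch of what the quoted hunk changes. It is not the kernel code: the enum and the helper may_enter_live() are hypothetical names made up for illustration, and only the states mentioned in this thread are modelled. The 'strict' flag selects the behaviour after the patch. The sketch only shows that a direct RESETTING -> LIVE (or NEW -> LIVE) transition is refused once the two case labels are removed; it does not show which path on powerpc, if any, depends on such a transition.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the controller states named in this thread. */
enum ctrl_state { CTRL_NEW, CTRL_LIVE, CTRL_RESETTING, CTRL_CONNECTING };

/*
 * May a controller enter LIVE from old_state?
 * strict == false: NEW, RESETTING and CONNECTING are accepted (before the patch).
 * strict == true:  only CONNECTING is accepted (after the patch).
 */
static bool may_enter_live(enum ctrl_state old_state, bool strict)
{
	switch (old_state) {
	case CTRL_NEW:
	case CTRL_RESETTING:
		/* These two cases are the ones the patch deletes. */
		return !strict;
	case CTRL_CONNECTING:
		return true;
	default:
		return false;
	}
}
```

With strict set, may_enter_live(CTRL_RESETTING, true) returns false, so a caller that attempts that transition would be refused and the controller would stay in its current state unless it goes through CONNECTING first.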