Message-ID: <fa04f3d5-56ff-62bb-0afd-ad94e961ddee@gmail.com>
Date: Fri, 25 Apr 2025 20:01:45 +0800
From: Linjun Bao <meljbao@...il.com>
To: Keith Busch <kbusch@...nel.org>, Jens Axboe <axboe@...com>,
 Christoph Hellwig <hch@....de>, Sagi Grimberg <sagi@...mberg.me>,
 linux-kernel@...r.kernel.org, linux-nvme@...ts.infradead.org
Subject: [PATCH] nvme: avoid missing db ring during reset

During an nvme reset there is a rare case in which a user admin
command, such as smart-log, and the nvme_admin_create_sq issued from
nvme_setup_io_queues() end up in the same blk_mq dispatch list, with
the user command last. nvme_admin_create_sq is dispatched first: in
nvme_queue_rq(), nvme_write_sq_db() is called but returns immediately
without writing the doorbell because the request is not marked
"last". The subsequent smart-log ioctl then fails fast in
nvme_fail_nonready_command(), skipping both nvme_sq_copy_cmd() and
nvme_write_sq_db(), so no doorbell write ever occurs and
nvme_admin_create_sq eventually times out.

The proposal is to stop treating user admin commands as non-ready
while the controller is in the CONNECTING state and to pass them
through to the drive instead, so that the doorbell write cannot be
missed in the case above.

Signed-off-by: Linjun Bao <meljbao@...il.com>
---
 drivers/nvme/host/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
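
For reviewers who want to reason about the sequence without hardware,
here is a rough user-space sketch of the dispatch described in the
commit message. It is not kernel code: sim_queue, sim_req,
sim_check_ready(), sim_write_sq_db() and sim_run() are hypothetical
names made up for illustration, and the model assumes a queue depth of
32, a dispatch list of exactly two requests (create_sq first, user
command last), and that a user command let through by the patched
check is simply queued.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are not the kernel's types. */
enum sim_state { SIM_LIVE, SIM_CONNECTING };

struct sim_queue {
	unsigned int sq_tail;      /* next free submission queue slot */
	unsigned int last_sq_tail; /* tail value last written to the doorbell */
	unsigned int q_depth;
	unsigned int db_writes;    /* number of doorbell writes performed */
};

struct sim_req {
	bool user_cmd;             /* models NVME_REQ_USERCMD */
};

/* Models the batching: skip the doorbell unless this request is "last"
 * or the tail is about to catch up with the last value written. */
static void sim_write_sq_db(struct sim_queue *q, bool last)
{
	if (!last) {
		unsigned int next_tail = (q->sq_tail + 1) % q->q_depth;

		if (next_tail != q->last_sq_tail)
			return;
	}
	q->last_sq_tail = q->sq_tail;
	q->db_writes++;
}

/* Models only the admin user-command check; "patched" adds the
 * CONNECTING exception proposed in this patch. */
static bool sim_check_ready(enum sim_state state, const struct sim_req *rq,
			    bool patched)
{
	if (state == SIM_LIVE)
		return true;
	if (rq->user_cmd && !(patched && state == SIM_CONNECTING))
		return false;
	return true;
}

/* Dispatch create_sq followed by a user command while CONNECTING and
 * return how many times the doorbell was written. */
static unsigned int sim_run(bool patched)
{
	struct sim_queue q = { .q_depth = 32 };
	const struct sim_req list[] = {
		{ .user_cmd = false }, /* nvme_admin_create_sq */
		{ .user_cmd = true },  /* e.g. smart-log */
	};
	const size_t n = sizeof(list) / sizeof(list[0]);

	for (size_t i = 0; i < n; i++) {
		bool last = (i == n - 1);

		if (!sim_check_ready(SIM_CONNECTING, &list[i], patched))
			continue; /* failed fast: no copy, no doorbell */
		q.sq_tail = (q.sq_tail + 1) % q.q_depth; /* models the SQ copy */
		sim_write_sq_db(&q, last);
	}
	return q.db_writes;
}
```

In this model sim_run(false) returns 0, i.e. create_sq sits in the
queue with no doorbell write, and sim_run(true) returns 1 because the
user command is queued as "last" and rings the doorbell for both
entries.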

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 3cc79817e4d7..fc550226ed77 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -750,7 +750,8 @@ bool __nvme_check_ready(struct nvme_ctrl *ctrl, struct request *rq,
 	 * sequence. until the controller will be LIVE, fail with
 	 * BLK_STS_RESOURCE so that they will be rescheduled.
 	 */
-	if (rq->q == ctrl->admin_q && (req->flags & NVME_REQ_USERCMD))
+	if (rq->q == ctrl->admin_q && (req->flags & NVME_REQ_USERCMD) &&
+	    (nvme_ctrl_state(ctrl) != NVME_CTRL_CONNECTING))
 		return false;
 
 	if (ctrl->ops->flags & NVME_F_FABRICS) {
-- 
2.25.1
