lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <572cfcc0-197a-9ead-9cb-3c5bf5e735@google.com>
Date:   Fri, 23 Dec 2022 21:24:56 -0800 (PST)
From:   Hugh Dickins <hughd@...gle.com>
To:     Christoph Hellwig <hch@...radead.org>
cc:     Jens Axboe <axboe@...nel.dk>, Keith Busch <kbusch@...nel.org>,
        Sagi Grimberg <sagi@...mberg.me>,
        Chaitanya Kulkarni <kch@...dia.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Thorsten Leemhuis <regressions@...mhuis.info>,
        linux-block@...r.kernel.org, linux-nvme@...ts.infradead.org,
        linux-kernel@...r.kernel.org
Subject: 6.2 nvme-pci: something wrong

Hi Christoph,

There's something wrong with the nvme-pci heading for 6.2-rc1:
no problem booting here on this Lenovo ThinkPad X1 Carbon 5th,
but under load...

nvme nvme0: I/O 0 (I/O Cmd) QID 2 timeout, aborting
nvme nvme0: I/O 1 (I/O Cmd) QID 2 timeout, aborting
nvme nvme0: I/O 2 (I/O Cmd) QID 2 timeout, aborting
nvme nvme0: I/O 3 (I/O Cmd) QID 2 timeout, aborting
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: I/O 0 QID 2 timeout, reset controller

...and more, until I just have to poweroff and reboot.

Bisection points to your
0da7feaa5913 ("nvme-pci: use the tagset alloc/free helpers")
And that does revert cleanly, giving a kernel which shows no problem.

I've spent a while comparing old nvme_pci_alloc_tag_set() and new
nvme_alloc_io_tag_set(), I do not know my way around there at all
and may be talking nonsense, but it did look as if there might now
be a difference in the queue_depth, sqsize, q_depth conversions.

I'm running load successfully with the patch below, but I strongly
suspect that the right patch will be somewhere else: over to you!

Hugh

--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4926,7 +4926,7 @@ int nvme_alloc_io_tag_set(struct nvme_ct
 
 	memset(set, 0, sizeof(*set));
 	set->ops = ops;
-	set->queue_depth = ctrl->sqsize + 1;
+	set->queue_depth = ctrl->sqsize;
 	/*
 	 * Some Apple controllers requires tags to be unique across admin and
 	 * the (only) I/O queue, so reserve the first 32 tags of the I/O queue.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ