Date:   Wed, 4 Mar 2020 07:02:46 +0900
From:   Keith Busch <kbusch@...nel.org>
To:     "Jason A. Donenfeld" <Jason@...c4.com>
Cc:     linux-nvme@...ts.infradead.org, linux-ext4@...r.kernel.org
Subject: Re: "I/O 8 QID 0 timeout, reset controller" on 5.6-rc2

On Mon, Mar 02, 2020 at 10:03:39AM +0800, Jason A. Donenfeld wrote:
> Hi,
> 
> My torrent client was doing some I/O when the below happened. I'm
> wondering if this is a known thing that's been fixed during the rc
> cycle, a regression, or if my (pretty new) NVMe drive is failing.
> 
> Thanks,
> Jason
> 
> Feb 24 20:36:58 thinkpad kernel: nvme nvme1: I/O 852 QID 15 timeout, aborting
> Feb 24 20:37:29 thinkpad kernel: nvme nvme1: I/O 852 QID 15 timeout, reset controller
> Feb 24 20:37:59 thinkpad kernel: nvme nvme1: I/O 8 QID 0 timeout, reset controller
> Feb 24 20:39:00 thinkpad kernel: nvme nvme1: Device not ready; aborting reset
> Feb 24 20:39:00 thinkpad kernel: nvme nvme1: Abort status: 0x371

Sorry to say, this indicates the controller has become unresponsive.
You usually see "timeout" messages in batches, though, so I wonder
whether only the one I/O command timed out or whether the controller
just doesn't support an abort command limit.
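
To check whether the timeouts really are isolated rather than arriving
in batches, the log can be tallied per queue and I/O number. The
following is only a rough sketch in Python: the regular expression is
an assumption based on the message format in the quoted log, and the
names TIMEOUT_RE, summarize_timeouts and SAMPLE are hypothetical.

```python
import re
from collections import Counter

# Assumed message shape, taken from the quoted log lines, e.g.
#   nvme nvme1: I/O 852 QID 15 timeout, aborting
TIMEOUT_RE = re.compile(r"nvme (nvme\d+): I/O (\d+) QID (\d+) timeout, (.+)$")


def summarize_timeouts(lines):
    """Count timeout messages per (controller, queue id, I/O number)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line.strip())
        if match:
            ctrl, io_num, qid, _action = match.groups()
            counts[(ctrl, int(qid), int(io_num))] += 1
    return counts


# The log lines from the quoted report.
SAMPLE = [
    "Feb 24 20:36:58 thinkpad kernel: nvme nvme1: I/O 852 QID 15 timeout, aborting",
    "Feb 24 20:37:29 thinkpad kernel: nvme nvme1: I/O 852 QID 15 timeout, reset controller",
    "Feb 24 20:37:59 thinkpad kernel: nvme nvme1: I/O 8 QID 0 timeout, reset controller",
    "Feb 24 20:39:00 thinkpad kernel: nvme nvme1: Device not ready; aborting reset",
    "Feb 24 20:39:00 thinkpad kernel: nvme nvme1: Abort status: 0x371",
]

if __name__ == "__main__":
    for (ctrl, qid, io_num), n in sorted(summarize_timeouts(SAMPLE).items()):
        print(f"{ctrl} QID {qid} I/O {io_num}: {n} timeout message(s)")
```

Feeding it the full journal instead of SAMPLE would show whether any
other commands timed out around the same time.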

You can try throttling the queue depth to see whether the problem goes
away. The lowest possible depth can be set with the kernel parameter
"nvme.io_queue_depth=2".
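
After rebooting with that parameter, it may be worth confirming that
it took effect. A minimal sketch in Python, assuming the nvme driver
exposes the value under /sys/module/nvme/parameters/io_queue_depth
(module parameters normally appear there, but treat the path as an
assumption for your kernel); PARAM_PATH and read_io_queue_depth are
hypothetical names:

```python
from pathlib import Path

# Assumed sysfs location of the nvme module parameter.
PARAM_PATH = Path("/sys/module/nvme/parameters/io_queue_depth")


def read_io_queue_depth(path=PARAM_PATH):
    """Return the configured queue depth, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (FileNotFoundError, PermissionError, ValueError):
        # Missing file (driver not loaded), no access, or unexpected content.
        return None


if __name__ == "__main__":
    depth = read_io_queue_depth()
    if depth is None:
        print("io_queue_depth not readable; is the nvme driver loaded?")
    else:
        print(f"nvme io_queue_depth = {depth}")
```

If this reports 2 and the timeouts stop, that points at the drive
struggling with deeper queues rather than at a kernel regression.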
