Message-ID: <4429eed8-b9b9-6943-f76-6ea38d695248@panix.com>
Date:   Thu, 2 Dec 2021 07:13:11 -0800 (PST)
From:   "Kenneth R. Crudup" <kenny@...ix.com>
To:     Jens Axboe <axboe@...nel.dk>
cc:     linux-bcache@...r.kernel.org, linux-block@...r.kernel.org,
        linux-ext4@...r.kernel.org, linux-nvme@...ts.infradead.org,
        "Kenneth R. Crudup" <kenny@...ix.com>, shinichiro.kawasaki@....com
Subject: Re: Write I/O queue hangup at random on recent Linus' kernels


On Thu, 25 Nov 2021, Kenneth R. Crudup wrote:

> > echo 0 > /sys/block/nvme0n1/queue/wbt_lat_usec
> > and that will disable writeback throttling on that device.

> It's been about 48 hours and haven't seen the issue since doing this.

I'm now back to running Linus' master, which includes your fix for this, and
I'm no longer disabling WBT.

It appears we may still have issues. Everything was going OK until yesterday,
when I had an SD card holding an image of an SSD with a dm volume group on it;
I'd attached that image as a loop device and activated the volume group
(... was that clear?).

While I'm not seeing any kernel messages related to my NVMe (root) device, I'm
again seeing the same UI issues as before: the KDE toolbar is unresponsive, and
I don't have full interaction with my desktop UI (e.g., I can't click on the
bottom button bar to switch to the active window). I'm pretty sure this is a
symptom of another I/O problem, however.

I tried to unfreeze it by:

  $ echo 0 | sudo tee -a /sys/block/mmcblk0/queue/wbt_lat_usec /sys/block/loop?/queue/wbt_lat_usec

... and a couple of seconds after that, it looked as if some queued-up toolbar
actions were flushed out (but the toolbar then went unresponsive again), so I
tried this to see whether I had any throttled/stalled I/Os:

  $ egrep -r . $(sudo find /sys/block/*/ -name inflight )

... but they were all zeros.
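
For anyone wanting to repeat this check, here is a minimal sketch that prints the in-flight read/write counters next to the current WBT setting for each whole block device. It is not what was run above. The function name show_queue_state and its optional root-directory argument are made up for illustration (the argument only exists so the function can be pointed at a test tree instead of /sys/block). Unlike the find-based command, it does not descend into partitions.

```shell
# Sketch only: show_queue_state is a hypothetical helper, not an existing tool.
# Prints one line per block device: in-flight reads/writes and wbt_lat_usec.
show_queue_state() {
    root="${1:-/sys/block}"   # assumption: optional override of the sysfs path
    for dev in "$root"/*; do
        [ -d "$dev" ] || continue
        name=$(basename "$dev")
        # "inflight" holds two counters: reads and writes currently in flight
        if [ -r "$dev/inflight" ]; then
            read -r rd wr < "$dev/inflight"
        else
            rd="-"; wr="-"
        fi
        # queue/wbt_lat_usec may be absent for some devices; 0 means WBT is off
        if [ -r "$dev/queue/wbt_lat_usec" ]; then
            wbt=$(cat "$dev/queue/wbt_lat_usec")
        else
            wbt="-"
        fi
        printf '%s reads=%s writes=%s wbt_lat_usec=%s\n' "$name" "$rd" "$wr" "$wbt"
    done
}
```

Calling show_queue_state with no argument reads /sys/block. All-zero counters there only say that nothing is outstanding at the block layer for those devices; they would not rule out a stall elsewhere.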

Does this make sense? Your patch seemed to be block-device-agnostic, but is there
a chance there's a similar path in the "dm" code that also needs to be fixed?

Thanks,

	-Kenny

-- 
Kenneth R. Crudup / Sr. SW Engineer, Scott County Consulting, Orange County CA
