lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aTH8opTiwJxH2PMA@kbusch-mbp>
Date: Thu, 4 Dec 2025 14:26:58 -0700
From: Keith Busch <kbusch@...nel.org>
To: Bart Van Assche <bvanassche@....org>
Cc: Mohamed Khalfella <mkhalfella@...estorage.com>,
	Chaitanya Kulkarni <kch@...dia.com>, Christoph Hellwig <hch@....de>,
	Jens Axboe <axboe@...nel.dk>, Sagi Grimberg <sagi@...mberg.me>,
	Casey Chen <cachen@...estorage.com>,
	Yuanyuan Zhong <yzhong@...estorage.com>,
	Hannes Reinecke <hare@...e.de>, Ming Lei <ming.lei@...hat.com>,
	Waiman Long <llong@...hat.com>, Hillf Danton <hdanton@...a.com>,
	linux-nvme@...ts.infradead.org, linux-block@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] block: Use RCU in blk_mq_[un]quiesce_tagset()
 instead of set->tag_list_lock

On Thu, Dec 04, 2025 at 10:24:03AM -1000, Bart Van Assche wrote:
> 
> On 12/4/25 9:57 AM, Mohamed Khalfella wrote:
> > I do not see how running this code in another thread will solve the
> > problem.
> blk_mq_freeze_queue_wait() waits forever because nvme_timeout() waits
> for blk_mq_freeze_queue_wait() to finish. 

No, nvme_timeout does NOT wait for freeze to finish. It wants freeze to
finish, but it proceeds anyway whether it finished or not. If the freeze
didn't completele, anything outstanding after that will be forcefully
reclaimed and requeued once we ensure the device is disabled.

> Hence, the deadlock can be
> solved by removing the blk_mq_quiesce_tagset() call from nvme_timeout()
> and by failing I/O from inside nvme_timeout(). If nvme_timeout() fails
> I/O and does not call blk_mq_quiesce_tagset() then the
> blk_mq_freeze_queue_wait() call will finish instead of triggering a
> deadlock. However, I do not know whether this proposal seems acceptable
> to the NVMe maintainers.

You periodically make this suggestion, but there's never a reason
offered to introduce yet another work queue for the driver to
synchronize with at various points. The whole point of making blk-mq
timeout handler in a work queue (it used to be a timer) was so that we
could do blocking actions like this.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ