lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aRvzCs_FXa_liTWv@fedora>
Date: Tue, 18 Nov 2025 12:16:10 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Mohamed Khalfella <mkhalfella@...estorage.com>
Cc: Jens Axboe <axboe@...nel.dk>, Keith Busch <kbusch@...nel.org>,
	Sagi Grimberg <sagi@...mberg.me>,
	Chaitanya Kulkarni <kch@...dia.com>,
	Casey Chen <cachen@...estorage.com>,
	Vikas Manocha <vmanocha@...estorage.com>,
	Yuanyuan Zhong <yzhong@...estorage.com>,
	Hannes Reinecke <hare@...e.de>, linux-nvme@...ts.infradead.org,
	linux-block@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/1] nvme: Convert tag_list mutex to rwsemaphore to
 avoid deadlock

On Mon, Nov 17, 2025 at 07:44:05PM -0800, Mohamed Khalfella wrote:
> On Tue 2025-11-18 10:30:52 +0800, Ming Lei wrote:
> > On Mon, Nov 17, 2025 at 06:15:04PM -0800, Mohamed Khalfella wrote:
> > > On Tue 2025-11-18 10:00:19 +0800, Ming Lei wrote:
> > > > On Mon, Nov 17, 2025 at 12:23:53PM -0800, Mohamed Khalfella wrote:
> > > > >  static void blk_mq_add_queue_tag_set(struct blk_mq_tag_set *set,
> > > > >  				     struct request_queue *q)
> > > > >  {
> > > > > -	mutex_lock(&set->tag_list_lock);
> > > > > +	struct request_queue *firstq;
> > > > > +	unsigned int memflags;
> > > > >  
> > > > > -	/*
> > > > > -	 * Check to see if we're transitioning to shared (from 1 to 2 queues).
> > > > > -	 */
> > > > > -	if (!list_empty(&set->tag_list) &&
> > > > > -	    !(set->flags & BLK_MQ_F_TAG_QUEUE_SHARED)) {
> > > > > -		set->flags |= BLK_MQ_F_TAG_QUEUE_SHARED;
> > > > > -		/* update existing queue */
> > > > > -		blk_mq_update_tag_set_shared(set, true);
> > > > > -	}
> > > > > -	if (set->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
> > > > > -		queue_set_hctx_shared(q, true);
> > > > > -	list_add_tail(&q->tag_set_list, &set->tag_list);
> > > > > +	down_write(&set->tag_list_rwsem);
> > > > > +	if (!list_is_singular(&set->tag_list)) {
> > > > > +		if (set->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
> > > > > +			queue_set_hctx_shared(q, true);
> > > > > +		list_add_tail(&q->tag_set_list, &set->tag_list);
> > > > > +		up_write(&set->tag_list_rwsem);
> > > > > +		return;
> > > > > +	}
> > > > >  
> > > > > -	mutex_unlock(&set->tag_list_lock);
> > > > > +	/* Transitioning firstq and q to shared. */
> > > > > +	set->flags |= BLK_MQ_F_TAG_QUEUE_SHARED;
> > > > > +	list_add_tail(&q->tag_set_list, &set->tag_list);
> > > > > +	downgrade_write(&set->tag_list_rwsem);
> > > > > +	queue_set_hctx_shared(q, true);
> > > > 
> > > > queue_set_hctx_shared(q, true) should be moved into write critical area
> > > > because this queue has been added to the list.
> > > > 
> > > 
> > > I failed to see why that is the case. What can go wrong by running
> > > queue_set_hctx_shared(q, true) after downgrade_write()?
> > > 
> > > After the semaphore is downgraded we promise not to change the list
> > > set->tag_list because now we have read-only access. Marking the "q" as
> > > shared should be fine because it is new and we know there will be no
> > > users of the queue yet (that is why we skipped freezing it).
> >  
> > I think it is read/write lock's use practice. The protected data shouldn't be
> > written any more when you downgrade to read lock.
> > 
> > In this case, it may not make a difference, because it is one new queue and
> > the other readers don't use the `shared` flag, but still better to do
> > correct things from beginning and make code less fragile.
> > 
> 
> set->tag_list_rwsem protects set->tag_list. It does not protect
> hctx->flags. hctx->flags is protected by the context. In the case of "q"
> it is new and we are not expecting request allocation. In case of
> "firstq" the queue is frozen which makes it safe to update hctx->flags.
> I prefer to keep the code as it is unless there is a reason to change
> it.

Fair enough, given it is done for `firstrq`.


Thanks, 
Ming


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ