Message-ID: <20251118021504.GC2197103-mkhalfella@purestorage.com>
Date: Mon, 17 Nov 2025 18:15:04 -0800
From: Mohamed Khalfella <mkhalfella@...estorage.com>
To: Ming Lei <ming.lei@...hat.com>
Cc: Jens Axboe <axboe@...nel.dk>, Keith Busch <kbusch@...nel.org>,
Sagi Grimberg <sagi@...mberg.me>,
Chaitanya Kulkarni <kch@...dia.com>,
Casey Chen <cachen@...estorage.com>,
Vikas Manocha <vmanocha@...estorage.com>,
Yuanyuan Zhong <yzhong@...estorage.com>,
Hannes Reinecke <hare@...e.de>, linux-nvme@...ts.infradead.org,
linux-block@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/1] nvme: Convert tag_list mutex to rwsemaphore to
avoid deadlock
On Tue 2025-11-18 10:00:19 +0800, Ming Lei wrote:
> On Mon, Nov 17, 2025 at 12:23:53PM -0800, Mohamed Khalfella wrote:
> >  static void blk_mq_add_queue_tag_set(struct blk_mq_tag_set *set,
> >  				     struct request_queue *q)
> >  {
> > -	mutex_lock(&set->tag_list_lock);
> > +	struct request_queue *firstq;
> > +	unsigned int memflags;
> >
> > -	/*
> > -	 * Check to see if we're transitioning to shared (from 1 to 2 queues).
> > -	 */
> > -	if (!list_empty(&set->tag_list) &&
> > -	    !(set->flags & BLK_MQ_F_TAG_QUEUE_SHARED)) {
> > -		set->flags |= BLK_MQ_F_TAG_QUEUE_SHARED;
> > -		/* update existing queue */
> > -		blk_mq_update_tag_set_shared(set, true);
> > -	}
> > -	if (set->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
> > -		queue_set_hctx_shared(q, true);
> > -	list_add_tail(&q->tag_set_list, &set->tag_list);
> > +	down_write(&set->tag_list_rwsem);
> > +	if (!list_is_singular(&set->tag_list)) {
> > +		if (set->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
> > +			queue_set_hctx_shared(q, true);
> > +		list_add_tail(&q->tag_set_list, &set->tag_list);
> > +		up_write(&set->tag_list_rwsem);
> > +		return;
> > +	}
> >
> > -	mutex_unlock(&set->tag_list_lock);
> > +	/* Transitioning firstq and q to shared. */
> > +	set->flags |= BLK_MQ_F_TAG_QUEUE_SHARED;
> > +	list_add_tail(&q->tag_set_list, &set->tag_list);
> > +	downgrade_write(&set->tag_list_rwsem);
> > +	queue_set_hctx_shared(q, true);
>
> queue_set_hctx_shared(q, true) should be moved into write critical area
> because this queue has been added to the list.
>
I fail to see why that is the case. What can go wrong if
queue_set_hctx_shared(q, true) runs after downgrade_write()?

After the semaphore is downgraded we hold it for read only, so we promise
not to modify the list set->tag_list. Marking "q" as shared should still
be fine because the queue is new and we know it has no users yet (that is
why we skipped freezing it).
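
To make the ordering under discussion concrete, here is a rough userspace
sketch. It is a toy, single-threaded model of the rwsem state transitions
and not kernel code: every toy_* name is invented for illustration, and the
two bool fields merely stand in for list_add_tail() and
queue_set_hctx_shared(q, true). It only shows that, once the lock is
downgraded, another reader can get in and observe "q" on the list before it
is marked shared. Whether any real reader cares about that state is exactly
the open question, and the model does not answer it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of an rwsem: state only, no blocking, not thread-safe. */
struct toy_rwsem {
	int readers;	/* number of read holders */
	bool writer;	/* true while held for write */
};

static bool toy_down_write_trylock(struct toy_rwsem *s)
{
	if (s->writer || s->readers)
		return false;
	s->writer = true;
	return true;
}

static bool toy_down_read_trylock(struct toy_rwsem *s)
{
	if (s->writer)
		return false;
	s->readers++;
	return true;
}

/* The writer becomes a reader without ever dropping the lock. */
static void toy_downgrade_write(struct toy_rwsem *s)
{
	assert(s->writer);
	s->writer = false;
	s->readers = 1;
}

static void toy_up_read(struct toy_rwsem *s)
{
	assert(s->readers > 0);
	s->readers--;
}

/* Hypothetical stand-in for the new request_queue. */
struct toy_queue {
	bool on_list;		/* models list_add_tail() having run */
	bool hctx_shared;	/* models queue_set_hctx_shared(q, true) */
};

/*
 * Replays the ordering from the patch. Returns true if a second reader,
 * entering right after the downgrade, sees q on the list but not yet
 * marked shared.
 */
static bool toy_reader_sees_unshared_q(void)
{
	struct toy_rwsem sem = { 0, false };
	struct toy_queue q = { false, false };
	bool window = false;

	toy_down_write_trylock(&sem);
	q.on_list = true;		/* done under the write lock */
	toy_downgrade_write(&sem);

	/* Another context may now take the lock for read. */
	if (toy_down_read_trylock(&sem)) {
		window = q.on_list && !q.hctx_shared;
		toy_up_read(&sem);
	}

	q.hctx_shared = true;		/* done under the read lock */
	toy_up_read(&sem);
	return window;
}
```

Calling toy_reader_sees_unshared_q() from a small main() returns true in
this model. Moving the q.hctx_shared assignment above toy_downgrade_write()
closes that window, which is the alternative ordering suggested in the
review.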