[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251118034405.GB2376676-mkhalfella@purestorage.com>
Date: Mon, 17 Nov 2025 19:44:05 -0800
From: Mohamed Khalfella <mkhalfella@...estorage.com>
To: Ming Lei <ming.lei@...hat.com>
Cc: Jens Axboe <axboe@...nel.dk>, Keith Busch <kbusch@...nel.org>,
Sagi Grimberg <sagi@...mberg.me>,
Chaitanya Kulkarni <kch@...dia.com>,
Casey Chen <cachen@...estorage.com>,
Vikas Manocha <vmanocha@...estorage.com>,
Yuanyuan Zhong <yzhong@...estorage.com>,
Hannes Reinecke <hare@...e.de>, linux-nvme@...ts.infradead.org,
linux-block@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/1] nvme: Convert tag_list mutex to rwsemaphore to
avoid deadlock
On Tue 2025-11-18 10:30:52 +0800, Ming Lei wrote:
> On Mon, Nov 17, 2025 at 06:15:04PM -0800, Mohamed Khalfella wrote:
> > On Tue 2025-11-18 10:00:19 +0800, Ming Lei wrote:
> > > On Mon, Nov 17, 2025 at 12:23:53PM -0800, Mohamed Khalfella wrote:
> > > > static void blk_mq_add_queue_tag_set(struct blk_mq_tag_set *set,
> > > > struct request_queue *q)
> > > > {
> > > > - mutex_lock(&set->tag_list_lock);
> > > > + struct request_queue *firstq;
> > > > + unsigned int memflags;
> > > >
> > > > - /*
> > > > - * Check to see if we're transitioning to shared (from 1 to 2 queues).
> > > > - */
> > > > - if (!list_empty(&set->tag_list) &&
> > > > - !(set->flags & BLK_MQ_F_TAG_QUEUE_SHARED)) {
> > > > - set->flags |= BLK_MQ_F_TAG_QUEUE_SHARED;
> > > > - /* update existing queue */
> > > > - blk_mq_update_tag_set_shared(set, true);
> > > > - }
> > > > - if (set->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
> > > > - queue_set_hctx_shared(q, true);
> > > > - list_add_tail(&q->tag_set_list, &set->tag_list);
> > > > + down_write(&set->tag_list_rwsem);
> > > > + if (!list_is_singular(&set->tag_list)) {
> > > > + if (set->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
> > > > + queue_set_hctx_shared(q, true);
> > > > + list_add_tail(&q->tag_set_list, &set->tag_list);
> > > > + up_write(&set->tag_list_rwsem);
> > > > + return;
> > > > + }
> > > >
> > > > - mutex_unlock(&set->tag_list_lock);
> > > > + /* Transitioning firstq and q to shared. */
> > > > + set->flags |= BLK_MQ_F_TAG_QUEUE_SHARED;
> > > > + list_add_tail(&q->tag_set_list, &set->tag_list);
> > > > + downgrade_write(&set->tag_list_rwsem);
> > > > + queue_set_hctx_shared(q, true);
> > >
> > > queue_set_hctx_shared(q, true) should be moved into write critical area
> > > because this queue has been added to the list.
> > >
> >
> > I failed to see why that is the case. What can go wrong by running
> > queue_set_hctx_shared(q, true) after downgrade_write()?
> >
> > After the semaphore is downgraded we promise not to change the list
> > set->tag_list because now we have read-only access. Marking the "q" as
> > shared should be fine because it is new and we know there will be no
> > users of the queue yet (that is why we skipped freezing it).
>
> I think it is read/write lock's use practice. The protected data shouldn't be
> written any more when you downgrade to read lock.
>
> In this case, it may not make a difference, because it is one new queue and
> the other readers don't use the `shared` flag, but still better to do
> correct things from beginning and make code less fragile.
>
set->tag_list_rwsem protects set->tag_list. It does not protect
hctx->flags. hctx->flags is protected by the context. In the case of "q"
it is new and we are not expecting request allocation. In case of
"firstq" the queue is frozen which makes it safe to update hctx->flags.
I prefer to keep the code as it is unless there is a reason to change
it.
Powered by blists - more mailing lists