Message-ID: <20200416011247.GB11244@42.do-not-panic.com>
Date: Thu, 16 Apr 2020 01:12:47 +0000
From: Luis Chamberlain <mcgrof@...nel.org>
To: Bart Van Assche <bvanassche@....org>
Cc: Christoph Hellwig <hch@...radead.org>, axboe@...nel.dk,
viro@...iv.linux.org.uk, gregkh@...uxfoundation.org,
rostedt@...dmis.org, mingo@...hat.com, jack@...e.cz,
ming.lei@...hat.com, nstange@...e.de, akpm@...ux-foundation.org,
mhocko@...e.com, yukuai3@...wei.com, linux-block@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Omar Sandoval <osandov@...com>,
Hannes Reinecke <hare@...e.com>,
Michal Hocko <mhocko@...nel.org>
Subject: Re: [PATCH 3/5] blktrace: refcount the request_queue during ioctl
On Wed, Apr 15, 2020 at 07:18:22AM -0700, Bart Van Assche wrote:
> On 2020-04-15 05:34, Luis Chamberlain wrote:
> > On Wed, Apr 15, 2020 at 12:14:25AM -0700, Christoph Hellwig wrote:
> >> Btw, isn't blk_get_queue racy as well? Shouldn't we check
> >> blk_queue_dying after getting the reference and undo it if the queue is
> >> indeed dying?
> >
> > Yes that race should be possible:
> >
> > bool blk_get_queue(struct request_queue *q)
> > {
> > 	if (likely(!blk_queue_dying(q))) {
> > ----------> we can get the queue to go dying here <---------
> > 		__blk_get_queue(q);
> > 		return true;
> > 	}
> >
> > 	return false;
> > }
> > EXPORT_SYMBOL(blk_get_queue);
> >
> > I'll pile up a fix. I've also considered doing a full review of callers
> > outside of the core block layer using it, and maybe just unexporting
> > this. It was originally exported due to commit d86e0e83b ("block: export
> > blk_{get,put}_queue()") to fix a scsi bug, but I can't find any such
> > fix. I suspect bdgrab()/bdput() is more likely what drivers should be
> > using. That would allow us to keep this functionality internal.
>
> blk_get_queue() prevents concurrent freeing of struct request_queue but
> does not prevent concurrent blk_cleanup_queue() calls.
Wouldn't concurrent blk_cleanup_queue() calls be a bug? If so, should
I document that they are a bug, or simply prevent them?
> Callers of
> blk_get_queue() may encounter a change of the queue state from normal
> into dying any time during the blk_get_queue() call or after
> blk_get_queue() has finished. Maybe I'm overlooking something but I
> doubt that modifying blk_get_queue() will help.
Good point. To fix the race Christoph described, we'd have to take the
refcount of the request_queue into consideration and prevent a queue from
changing state to dying while the refcount is > 1. However, that would
also mean never allowing the request_queue to die.
One way to resolve this could be to keep track of a quiesce/dying
request, at that point stop blk_get_queue() from taking new references,
and once the refcount is down to 1, flip the switch to dying.
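
To make the idea concrete, here is a rough userspace sketch using C11
atomics instead of the kernel's own refcounting. Everything in it is
hypothetical: struct fake_queue, the dying_requested flag and the
fake_queue_*() helpers are illustration only and do not exist in the
block layer, and it says nothing about how blk_cleanup_queue() would
have to wait for the flip.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct request_queue; not the kernel layout. */
struct fake_queue {
	atomic_int refcount;         /* starts at 1: the owner's reference */
	atomic_bool dying_requested; /* teardown has been asked for */
	atomic_bool dying;           /* only the owner's reference is left */
};

static void fake_queue_init(struct fake_queue *q)
{
	atomic_init(&q->refcount, 1);
	atomic_init(&q->dying_requested, false);
	atomic_init(&q->dying, false);
}

/* Flip to dying only once a request is pending and the refcount is 1. */
static void fake_queue_try_mark_dying(struct fake_queue *q)
{
	if (atomic_load(&q->dying_requested) &&
	    atomic_load(&q->refcount) == 1)
		atomic_store(&q->dying, true);
}

/* Record the dying request; the flip happens now or on a later put. */
static void fake_queue_request_dying(struct fake_queue *q)
{
	atomic_store(&q->dying_requested, true);
	fake_queue_try_mark_dying(q);
}

/* Take a reference unless a dying request has been recorded. */
static bool fake_queue_get(struct fake_queue *q)
{
	if (atomic_load(&q->dying_requested))
		return false;

	atomic_fetch_add(&q->refcount, 1);

	/* A dying request may have raced with the increment: undo it. */
	if (atomic_load(&q->dying_requested)) {
		atomic_fetch_sub(&q->refcount, 1);
		fake_queue_try_mark_dying(q);
		return false;
	}
	return true;
}

/* Drop a reference obtained through fake_queue_get(). */
static void fake_queue_put(struct fake_queue *q)
{
	int old = atomic_fetch_sub(&q->refcount, 1);

	assert(old > 1); /* the owner's reference is never dropped here */
	fake_queue_try_mark_dying(q);
}
```

With one user holding a reference, fake_queue_request_dying() leaves
the queue alive but makes further fake_queue_get() calls fail; the
queue only flips to dying when that user calls fake_queue_put().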
Luis