Message-ID: <49bfcbe0-2630-5c82-f305-fcee489ac9ea@acm.org>
Date: Wed, 15 Apr 2020 07:45:18 -0700
From: Bart Van Assche <bvanassche@....org>
To: Luis Chamberlain <mcgrof@...nel.org>,
Christoph Hellwig <hch@...radead.org>
Cc: axboe@...nel.dk, viro@...iv.linux.org.uk,
gregkh@...uxfoundation.org, rostedt@...dmis.org, mingo@...hat.com,
jack@...e.cz, ming.lei@...hat.com, nstange@...e.de,
akpm@...ux-foundation.org, mhocko@...e.com, yukuai3@...wei.com,
linux-block@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Omar Sandoval <osandov@...com>,
Hannes Reinecke <hare@...e.com>,
Michal Hocko <mhocko@...nel.org>
Subject: Re: [PATCH 3/5] blktrace: refcount the request_queue during ioctl

On 2020-04-14 23:16, Luis Chamberlain wrote:
> On Tue, Apr 14, 2020 at 08:40:44AM -0700, Christoph Hellwig wrote:
>> Hmm, where exactly does the race come in so that it can only happen
>> after where you take the reference, but not before it? I'm probably
>> missing something, but that just means it needs to be explained a little
>> better :)
>
> From the trace on patch 2/5:
>
> BLKTRACE_SETUP(loop0) #2
> [ 13.933961] == blk_trace_ioctl(2, BLKTRACESETUP) start
> [ 13.936758] === do_blk_trace_setup(2) start
> [ 13.938944] === do_blk_trace_setup(2) creating directory
> [ 13.941029] === do_blk_trace_setup(2) using what debugfs_lookup() gave
>
> ---> From LOOP_CTL_DEL(loop0) #2
> [ 13.971046] === blk_trace_cleanup(7) end
> [ 13.973175] == __blk_trace_remove(7) end
> [ 13.975352] == blk_trace_shutdown(7) end
> [ 13.977415] = __blk_release_queue(7) calling blk_mq_debugfs_unregister()
> [ 13.980645] ==== blk_mq_debugfs_unregister(7) begin
> [ 13.980696] ==== blk_mq_debugfs_unregister(7) debugfs_remove_recursive(q->debugfs_dir)
> [ 13.983118] ==== blk_mq_debugfs_unregister(7) end q->debugfs_dir is NULL
> [ 13.986945] = __blk_release_queue(7) blk_mq_debugfs_unregister() end
> [ 13.993155] = __blk_release_queue(7) end
>
> ---> From BLKTRACE_SETUP(loop0) #2
> [ 13.995928] === do_blk_trace_setup(2) end with ret: 0
> [ 13.997623] == blk_trace_ioctl(2, BLKTRACESETUP) end
>
> The BLKTRACESETUP above works on request_queue which later
> LOOP_CTL_DEL races on and sweeps the debugfs dir underneath us.
> Using this commit alone does not fix the race, however, because we
> still rely on debugfs_lookup() and the removal is still asynchronous
> at this point.
>
> refcounting will just ensure we don't take the request_queue underneath
> our noses.

I think the above trace reveals a bug in the loop driver. The loop
driver shouldn't allow the associated request queue to disappear while
the loop device is open. One may want to have a look at sd_open() in the
sd driver. The scsi_disk_get() call in that function not only increases
the reference count of the SCSI disk but also of the underlying SCSI device.
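
To make the suggestion concrete, here is a minimal userspace sketch of the pattern, not the actual sd or loop driver code. All names (model_queue, model_disk, disk_open() and so on) are hypothetical stand-ins, and plain integers replace the kernel's reference counting primitives. The assumption is only that open() pins both the disk and the queue underneath it, so a concurrent delete cannot release the queue while an opener still exists.

```c
#include <assert.h>

/* Hypothetical stand-ins; these are not the kernel's structures. */
struct model_queue {
	int refcount;
	int freed;	/* set instead of freeing, so a caller can inspect it */
};

struct model_disk {
	int refcount;
	struct model_queue *queue;
};

static void queue_get(struct model_queue *q)
{
	q->refcount++;
}

static void queue_put(struct model_queue *q)
{
	assert(q->refcount > 0);
	if (--q->refcount == 0)
		q->freed = 1;	/* a real release path would tear down here */
}

/* open(): pin the disk and the queue underneath it. */
static void disk_open(struct model_disk *d)
{
	d->refcount++;
	queue_get(d->queue);
}

/* release(): drop both references taken in disk_open(). */
static void disk_release(struct model_disk *d)
{
	queue_put(d->queue);
	d->refcount--;
}

/* Device deletion: drops only the driver's own reference on the queue. */
static void disk_delete(struct model_disk *d)
{
	queue_put(d->queue);
}
```

With a queue that starts at refcount 1 (the driver's reference), the sequence disk_open(), disk_delete(), disk_release() leaves the queue alive after the delete and releases it only when the last opener goes away. An ioctl issued through the open file descriptor would therefore never see the queue vanish.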
Thanks,

Bart.