Message-ID: <1552521077.45180.119.camel@acm.org>
Date: Wed, 13 Mar 2019 16:51:17 -0700
From: Bart Van Assche <bvanassche@....org>
To: Jason Yan <yanaijie@...wei.com>,
Christoph Hellwig <hch@...radead.org>
Cc: martin.petersen@...cle.com, jejb@...ux.vnet.ibm.com,
Jens Axboe <axboe@...nel.dk>, linux-scsi@...r.kernel.org,
linux-kernel@...r.kernel.org, hare@...e.com,
dan.j.williams@...el.com, jthumshirn@...e.de,
Steffen Maier <maier@...ux.ibm.com>
Subject: Re: [RFC PATCH] scsi: fix oops in scsi_uninit_cmd()
On Thu, 2019-02-21 at 16:53 +0800, Jason Yan wrote:
> On 2019/2/20 23:18, Christoph Hellwig wrote:
> > [fullquote removed, please follow proper mail etiquette]
> >
> > On Tue, Feb 19, 2019 at 08:56:28AM -0800, Bart Van Assche wrote:
> > > regression in the SCSI sd driver due to the switch from the legacy block
> > > layer to scsi-mq. The above patch introduces two atomic operations in the
> > > hot path and hence would introduce a performance regression. I think this
> > > can be avoided by making sure that sd_uninit_command() gets called before
> > > the request tag is freed. What changes would be required to make the block
> > > layer core call sd_uninit_command() before the request tag is freed? Would
> > > introducing prep_rq_fn and unprep_rq_fn callbacks in struct blk_mq_ops and
> > > making sure that the SCSI core sets these callback function pointers
> > > > appropriately be sufficient? Would such a change allow the NVMe initiator
> > > > driver to be simplified? Are there any alternatives to this approach that
> > > > are more elegant?
> >
> > Additional indirect calls in the I/O fast path are something I'd rather
> > avoid. But I don't fully understand the problem yet - where do
> > we release a disk reference from blk_update_request?
>
> When userspace closes the fd after blk_update_request() and before
> scsi_mq_uninit_cmd(), a disk reference is released. It is not
> blk_update_request() itself that releases it.
>
> close
>   ->sd_release
>     ->scsi_disk_put
>       ->scsi_disk_release
>         ->disk->private_data = NULL;
>
> Userspace can close the fd because blk_update_request() has completed the
> last I/O, so the application is no longer blocked in read() or
> write(). The window is very small, but it is reproduced every day
> in our test cases. So I'm very curious why. One possible explanation is
> that we enabled kernel preemption (CONFIG_PREEMPT).
>
> And why can't we move that release to __blk_mq_end_request?

Hi Jason,

What is the current status of this issue?

Thanks,

Bart.
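
The ordering problem described in the quoted text can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every `model_*` name below is hypothetical and none of it is kernel code. It shows that once the step standing in for blk_update_request() has unblocked userspace, a close that lands before the step standing in for scsi_mq_uninit_cmd() leaves that step with a NULL private_data pointer.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct model_sdkp {
	int pending_uninit;
};

struct model_disk {
	void *private_data;	/* models disk->private_data */
	int open_count;		/* models the references held by open fds */
};

/* Models close -> sd_release -> scsi_disk_put -> scsi_disk_release. */
static void model_release(struct model_disk *disk)
{
	if (--disk->open_count == 0)
		disk->private_data = NULL;
}

/*
 * Models the uninit step that dereferences disk->private_data.
 * Returns 0 on success and -1 where the real code would oops.
 */
static int model_uninit_cmd(struct model_disk *disk)
{
	struct model_sdkp *sdkp = disk->private_data;

	if (sdkp == NULL)
		return -1;
	sdkp->pending_uninit = 0;
	return 0;
}

/*
 * Runs one completion. If close_in_window is nonzero, the last close
 * happens between the two completion steps, as in the reported race.
 */
static int model_complete(int close_in_window)
{
	struct model_sdkp sdkp = { .pending_uninit = 1 };
	struct model_disk disk = { .private_data = &sdkp, .open_count = 1 };
	int rc;

	/* Step 1 (models blk_update_request()): userspace is unblocked here. */
	if (close_in_window)
		model_release(&disk);

	/* Step 2 (models scsi_mq_uninit_cmd()). */
	rc = model_uninit_cmd(&disk);

	if (!close_in_window)
		model_release(&disk);
	return rc;
}
```

In this model, model_complete(0) returns 0 and model_complete(1) returns -1. The model is single-threaded and deterministic, so it only demonstrates the ordering; it says nothing about how often the window is hit on a real system.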