Message-ID: <1552666744.45180.135.camel@acm.org>
Date: Fri, 15 Mar 2019 09:19:04 -0700
From: Bart Van Assche <bvanassche@....org>
To: "jianchao.wang" <jianchao.w.wang@...cle.com>,
Christoph Hellwig <hch@....de>
Cc: axboe@...nel.dk, linux-block@...r.kernel.org, jsmart2021@...il.com,
josef@...icpanda.com, linux-nvme@...ts.infradead.org,
linux-kernel@...r.kernel.org, keith.busch@...el.com, hare@...e.de,
jthumshirn@...e.de, sagi@...mberg.me
Subject: Re: [PATCH 0/8]: blk-mq: use static_rqs to iterate busy tags
On Fri, 2019-03-15 at 17:44 +0800, jianchao.wang wrote:
> On 3/15/19 5:20 PM, Christoph Hellwig wrote:
> > On Fri, Mar 15, 2019 at 04:57:36PM +0800, Jianchao Wang wrote:
> > > Hi Jens
> > >
> > > As we know, there is a risk of accessing stale requests when iterating
> > > in-flight requests with tags->rqs[], and this has been discussed in the
> > > following threads:
> > > [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__marc.info_-3Fl-3Dlinux-2Dscsi-26m-3D154511693912752-26w-3D2&d=DwICAg&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ&m=CydqJPTf4FUrfs7ipUc2chm2jGuNuDVn_onIetKEehM&s=ZQ7RfO6-737-t5kQv7SFlXMhIdpwn_AxJI93d6c-nj0&e=
> > > [2] https://urldefense.proofpoint.com/v2/url?u=https-3A__marc.info_-3Fl-3Dlinux-2Dblock-26m-3D154526189023236-26w-3D2&d=DwICAg&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ&m=CydqJPTf4FUrfs7ipUc2chm2jGuNuDVn_onIetKEehM&s=EBV1M5p4mE8jZ5ZD1ecU5kMbJ9EtbpVJoc7Tqolrsc8&e=
> >
> > I'd rather take one step back and figure out why we are iterating
> > the busy requests. There really shouldn't be any reason why a driver
> > is even doing that (vs some error handling helpers in the core
> > block code that can properly synchronize).
> >
>
> A typical scenario is blk_mq_in_flight:
>
> blk_mq_get_request              blk_mq_in_flight
>   -> blk_mq_get_tag               -> blk_mq_queue_tag_busy_iter
>                                     -> bt_for_each
>                                       -> bt_iter
>                                         -> rq = tags->rqs[]
>                                         -> rq->q  //---> get a stale request
>   -> blk_mq_rq_ctx_init
>     -> data->hctx->tags->rqs[rq->tag] = rq
>
> This stale request may be something that has been freed because the I/O
> scheduler was detached or because a queue using a shared tag set is gone.
>
> blk_mq_timeout_work could also use this iteration to pick up expired requests.
> A driver would also use it to requeue the in-flight requests when the device is dead.
>
> Compared with adding more synchronization, using static_rqs[] directly may be simpler :)
Hi Jianchao,

Although I appreciate your work, I agree with Christoph that we should avoid races
like this rather than modifying the block layer to make sure that such races are
handled safely.

Thanks,

Bart.
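
For readers following the thread, the interleaving quoted above can be modeled in userspace. What follows is only a rough sketch, not kernel code: every type and function name in it (struct tags, model_get_tag, model_rq_ctx_init, run_race_model) is hypothetical, and the `live` flag is a modeling device that stands in for memory being freed, so that the sketch never dereferences a dangling pointer. It assumes, as described in the quoted text, that the tag is marked busy before the rqs[] slot is rewritten, and that static_rqs[] stays valid for the lifetime of the tag set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_TAGS 4

/* Simplified stand-in for struct request; not the kernel definition. */
struct request {
    int tag;
    bool live;  /* model only: cleared when the owning pool is "freed" */
};

/* Simplified stand-in for the per-hctx tag structure. */
struct tags {
    bool busy[NR_TAGS];                   /* models the tag bitmap */
    struct request *rqs[NR_TAGS];         /* rewritten late, at request init */
    struct request *static_rqs[NR_TAGS];  /* fixed for the tag set's lifetime */
};

/* Count busy tags whose rqs[] slot still points at a request that is gone. */
static int stale_seen_via_rqs(const struct tags *t)
{
    int stale = 0;
    for (int i = 0; i < NR_TAGS; i++)
        if (t->busy[i] && t->rqs[i] && !t->rqs[i]->live)
            stale++;
    return stale;
}

/* Same walk, but through static_rqs[] instead of rqs[]. */
static int stale_seen_via_static_rqs(const struct tags *t)
{
    int stale = 0;
    for (int i = 0; i < NR_TAGS; i++)
        if (t->busy[i] && t->static_rqs[i] && !t->static_rqs[i]->live)
            stale++;
    return stale;
}

/* Models the tag allocation step: the tag becomes busy, rqs[] is untouched. */
static void model_get_tag(struct tags *t, int tag)
{
    assert(tag >= 0 && tag < NR_TAGS);
    t->busy[tag] = true;
}

/* Models the later rqs[] update done while initializing the request. */
static void model_rq_ctx_init(struct tags *t, int tag)
{
    assert(tag >= 0 && tag < NR_TAGS);
    t->rqs[tag] = t->static_rqs[tag];
}

struct race_result {
    int via_rqs_in_window;     /* iterator runs between get_tag and init */
    int via_static_in_window;  /* same window, walking static_rqs[] */
    int via_rqs_after_init;    /* iterator runs after rqs[] is rewritten */
};

static struct race_result run_race_model(void)
{
    static struct request driver_pool[NR_TAGS];
    static struct request sched_pool[NR_TAGS];
    struct tags t = {0};
    struct race_result r;

    for (int i = 0; i < NR_TAGS; i++) {
        driver_pool[i] = (struct request){ .tag = i, .live = true };
        sched_pool[i] = (struct request){ .tag = i, .live = true };
        t.static_rqs[i] = &driver_pool[i];
    }

    /* Earlier, a scheduler-owned request used tag 1 and then completed;
     * the slot in rqs[] was never cleared. */
    t.rqs[1] = &sched_pool[0];

    /* The scheduler is detached and its requests go away (modeled by flag). */
    for (int i = 0; i < NR_TAGS; i++)
        sched_pool[i].live = false;

    /* A new allocation takes tag 1; an iterator runs before init finishes. */
    model_get_tag(&t, 1);
    r.via_rqs_in_window = stale_seen_via_rqs(&t);
    r.via_static_in_window = stale_seen_via_static_rqs(&t);

    model_rq_ctx_init(&t, 1);
    r.via_rqs_after_init = stale_seen_via_rqs(&t);
    return r;
}
```

Calling run_race_model() from a small main() and printing the three counters shows the window in which an iterator walking rqs[] meets the leftover pointer, while the walk over static_rqs[] does not. The model says nothing about which fix is preferable; it only makes the ordering problem concrete.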