[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAFj5m9Jvw=3KPGYyChJu5nxraxwKm04kpMak5Ns+pjkHepsN-Q@mail.gmail.com>
Date: Wed, 6 Nov 2024 11:58:52 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Yu Kuai <yukuai1@...weicloud.com>
Cc: Christoph Hellwig <hch@...radead.org>, axboe@...nel.dk, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, yi.zhang@...wei.com, yangerkun@...wei.com,
"yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH -next] block: fix uaf for flush rq while iterating tags
On Wed, Nov 6, 2024 at 11:52 AM Yu Kuai <yukuai1@...weicloud.com> wrote:
>
> Hi,
>
> 在 2024/11/06 11:36, Ming Lei 写道:
> > On Wed, Nov 06, 2024 at 11:25:07AM +0800, Yu Kuai wrote:
> >> Hi,
> >>
> >> 在 2024/11/06 11:11, Ming Lei 写道:
> >>> On Wed, Nov 06, 2024 at 10:58:40AM +0800, Yu Kuai wrote:
> >>>> Hi,Ming and Christoph
> >>>>
> >>>> 在 2024/11/05 19:19, Christoph Hellwig 写道:
> >>>>> On Mon, Nov 04, 2024 at 07:00:05PM +0800, Yu Kuai wrote:
> >>>>>> From: Yu Kuai <yukuai3@...wei.com>
> >>>>>>
> >>>>>> blk_mq_clear_flush_rq_mapping() is not called during scsi probe, by
> >>>>>> checking blk_queue_init_done(). However, QUEUE_FLAG_INIT_DONE is cleared
> >>>>>> in del_gendisk by commit aec89dc5d421 ("block: keep q_usage_counter in
> >>>>>> atomic mode after del_gendisk"), hence for disk like scsi, following
> >>>>>> blk_mq_destroy_queue() will not clear flush rq from tags->rqs[] as well,
> >>>>>> cause following uaf that is found by our syzkaller for v6.6:
> >>>>>
> >>>>> Which means we leave the flush request lingering after del_gendisk,
> >>>>> which sounds like the real bug. I suspect we just need to move the
> >>>>> call to blk_mq_clear_flush_rq_mapping so that it is called from
> >>>>> del_gendisk and doesn't leave the flush tag lingering around.
> >>>>>
> >>>>
> >>>> This remind me that del_gendisk is still too late to do that. Noted that
> >>>> flush_rq can acquire different tags, so if the multiple flush_rq is done
> >>>> and those tags are not reused, the flush_rq can exist in multiple
> >>>> entries in tags->rqs[]. The consequence I can think of is that iterating
> >>>> tags can found the same flush_rq multiple times, and the flush_rq can be
> >>>> inflight.
> >>>
> >>> How can that be one problem?
> >>>
> >>> Please look at
> >>>
> >>> commit 364b61818f65 ("blk-mq: clearing flush request reference in tags->rqs[]")
> >>> commit bd63141d585b ("blk-mq: clear stale request in tags->rq[] before freeing one request pool")
> >>>
> >>> and understand the motivation.
> >>>
> >>> That also means it is just fine to delay blk_mq_clear_flush_rq_mapping()
> >>> after disk is deleted.
> >>
> >> I do understand what you mean. Let's see if you want this to be avoided,
> >> for example(no disk is deleted):
> >
> > It is definitely another issue, and not supposed to be covered by
> > blk_mq_clear_flush_rq_mapping().
> >
> >>
> >> 1) issue a flush, and tag 0 is used, after the flush is done,
> >> tags->rqs[0] = flush_rq
> >> 2) issue another flush, and tag 1 is used, after the flush is done,
> >> tags->rqs[1] = flush_rq
> >> 3) issue a flush again, and tag 2 is used, and the flush_rq is
> >> dispatched to disk;
> >
> > Yes, this kind of thing exists since blk-mq begins, and you can't expect
> > blk_mq_in_flight_rw() to get accurate inflight requests.
> >
> >> 4) Then in this case, blk_mq_in_flight_rw() will found the same flush_rq
> >> 3 times and think there are 3 inflight request, same for
> >> hctx_busy_show()...
> >
> > But we have tried to avoid it, such as in blk_mq_find_and_get_req()
> > req->tag is checked against the current 'bitnr' of sbitmap when walking
> > over tags. Then the same flush_rq won't be counted 3 times.
>
> Yes, I missed that req->tag is checked. Looks like there is no porblem
> now. :)
>
> Just to make sure, we're still on the same page for this patch, right?
Yeah, my reviewed-by still stands, and it is fine to delay to clear flush req
wrt. the original motivation, what matters is that it is cleared before the
flush req is freed.
Thanks,
Powered by blists - more mailing lists