Message-ID: <abde1955-d634-29d4-d229-df8c6ebdc582@huaweicloud.com>
Date: Fri, 15 Aug 2025 17:05:34 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Ming Lei <ming.lei@...hat.com>, Yu Kuai <yukuai1@...weicloud.com>
Cc: axboe@...nel.dk, hare@...e.de, nilay@...ux.ibm.com,
linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
yi.zhang@...wei.com, yangerkun@...wei.com, johnny.chenyi@...wei.com,
"yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH 00/10] blk-mq: fix blk_mq_tags double free while
nr_requests grown
Hi,
On 2025/08/15 16:30, Ming Lei wrote:
> On Fri, Aug 15, 2025 at 04:02:06PM +0800, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@...wei.com>
>>
>> When a user triggers tag growth through the queue sysfs attribute
>> nr_requests, hctx->sched_tags is freed directly and replaced with newly
>> allocated tags; see blk_mq_tag_update_depth().
>>
>> The problem is that hctx->sched_tags comes from elevator->et->tags, while
>> et->tags still points to the freed tags, so a later elevator exit will try
>> to free the tags again, causing a kernel panic.
>>
>> patches 1-6 are preparatory cleanup and refactoring for updating nr_requests
>> patches 7 and 8 are the fixes for the regression
>> patch 9 is a cleanup after patch 8
>> patch 10 fixes the stale nr_requests documentation
>
> Please do not mix bug (regression) fixes with cleanups.
>
> The bug fix for updating nr_requests should be simple enough to fit in one
> or two patches, so why do you need 10 patches to deal with the regression?
Ok, in short, my solution is:
- serialize switching elevator with updating nr_requests
- detect the case where nr_requests will grow, and allocate elevator_tags
before freezing the queue.
- for the grow case, switch to the new elevator_tags.
I did try, and I couldn't find an easy way to fix this without making the
related code awkward. Perhaps that is because I did the cleanups and
refactoring first and can't think outside the box...
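To make the three steps above concrete, here is a minimal userspace sketch in
C. It is not the kernel code and not the patches themselves: struct queue,
struct tags, update_lock, et_tags, tags_alloc(), update_nr_requests() and
elevator_exit() are all hypothetical stand-ins, a pthread mutex stands in for
whatever lock serializes the two paths, a plain flag stands in for freezing
the queue, and the shrink case is modeled as a simple in-place depth change.
The only point is that the owner pointer (modeling elevator->et->tags) and the
alias (modeling hctx->sched_tags) are switched together, so the old tags are
freed exactly once.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are not the kernel structures. */
struct tags {
	unsigned int depth;
};

struct queue {
	pthread_mutex_t update_lock;	/* serializes elevator switch and nr_requests update */
	struct tags *et_tags;		/* owner pointer, models elevator->et->tags */
	struct tags *sched_tags;	/* alias, models hctx->sched_tags */
	unsigned int nr_requests;
	int frozen;			/* models a frozen queue */
};

static struct tags *tags_alloc(unsigned int depth)
{
	struct tags *t = malloc(sizeof(*t));

	if (t)
		t->depth = depth;
	return t;
}

static int queue_init(struct queue *q, unsigned int depth)
{
	if (pthread_mutex_init(&q->update_lock, NULL))
		return -1;
	q->et_tags = tags_alloc(depth);
	if (!q->et_tags) {
		pthread_mutex_destroy(&q->update_lock);
		return -1;
	}
	q->sched_tags = q->et_tags;
	q->nr_requests = depth;
	q->frozen = 0;
	return 0;
}

static int update_nr_requests(struct queue *q, unsigned int nr)
{
	struct tags *new_tags = NULL, *old_tags = NULL;

	/* Step 1: serialize against elevator switch/exit. */
	pthread_mutex_lock(&q->update_lock);

	/* Step 2: for the grow case, allocate before freezing the queue. */
	if (nr > q->nr_requests) {
		new_tags = tags_alloc(nr);
		if (!new_tags) {
			pthread_mutex_unlock(&q->update_lock);
			return -1;
		}
	}

	assert(!q->frozen);
	q->frozen = 1;
	if (new_tags) {
		/* Step 3: switch owner and alias together to the new tags. */
		old_tags = q->et_tags;
		q->et_tags = new_tags;
		q->sched_tags = new_tags;
	} else {
		/* Shrink or no change: keep the tags (simplified model). */
		q->sched_tags->depth = nr;
	}
	q->nr_requests = nr;
	q->frozen = 0;

	pthread_mutex_unlock(&q->update_lock);
	free(old_tags);		/* the old tags are freed exactly once, here */
	return 0;
}

static void elevator_exit(struct queue *q)
{
	pthread_mutex_lock(&q->update_lock);
	free(q->et_tags);	/* owner pointer is never stale, so no double free */
	q->et_tags = NULL;
	q->sched_tags = NULL;
	pthread_mutex_unlock(&q->update_lock);
}

/* Grow, then shrink, then exit the elevator; returns 0 on success. */
static int demo(void)
{
	struct queue q;
	int ret = -1;

	if (queue_init(&q, 64))
		return -1;
	if (update_nr_requests(&q, 128))	/* grow: tags are replaced */
		goto out;
	if (q.et_tags != q.sched_tags || q.sched_tags->depth != 128)
		goto out;
	if (update_nr_requests(&q, 32))		/* shrink: tags are reused */
		goto out;
	if (q.sched_tags->depth != 32 || q.nr_requests != 32)
		goto out;
	ret = 0;
out:
	elevator_exit(&q);
	pthread_mutex_destroy(&q.update_lock);
	return ret;
}
```

In this model the bug corresponds to replacing only sched_tags in the grow
path while et_tags keeps pointing at the freed allocation, which
elevator_exit() would then free a second time.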
>
> Not to mention that this approach is really unfriendly for stable tree backports.
I checked: the last change to the code around queue_requests_store() was
commit 3efe7571c3ae ("block: protect nr_requests update using
q->elevator_lock"), and I believe that is what the patch being fixed relies
on, so I don't expect the backport to have many conflicts.
Whichever stable branch f5a6604f7a44 ("block: fix lockdep warning
caused by lock dependency in elv_iosched_store") is backported to, I can
make sure a proper fix is backported as well.
Thanks,
Kuai
>
>
> Thanks,
> Ming
>
>
> .
>