Message-ID: <0206ea4f-4efd-b7d0-088a-9257d06dcffb@huaweicloud.com>
Date: Wed, 20 Aug 2025 08:56:52 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Nilay Shroff <nilay@...ux.ibm.com>, Yu Kuai <yukuai1@...weicloud.com>,
axboe@...nel.dk, bvanassche@....org, ming.lei@...hat.com, hare@...e.de
Cc: linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
yi.zhang@...wei.com, yangerkun@...wei.com, johnny.chenyi@...wei.com,
"yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH v2 1/2] blk-mq: fix elevator depth_updated method
Hi,
On 2025/08/19 20:20, Nilay Shroff wrote:
>
>
> On 8/19/25 6:59 AM, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@...wei.com>
>>
>> Current depth_updated has some problems:
>>
>> 1) depth_updated() will be called for each hctx, while all elevators
>> update async_depth at the disk level; this is not related to hctx;
>> 2) In blk_mq_update_nr_requests(), if a previous hctx update succeeds and
>> this hctx update fails, q->nr_requests will not be updated, while
>> async_depth was already updated with the new nr_requests in the previous
>> depth_updated();
>> 3) All elevators are using q->nr_requests to calculate async_depth now,
>> however, q->nr_requests still holds the old value when depth_updated() is
>> called from blk_mq_update_nr_requests();
>>
>> Fix those problems by:
>>
>> - pass in request_queue instead of hctx;
>> - move depth_updated() after q->nr_requests is updated in
>> blk_mq_update_nr_requests();
>> - add depth_updated() call in blk_mq_init_sched();
>> - remove init_hctx() method for mq-deadline and bfq that is useless now;
>>
>> Fixes: 77f1e0a52d26 ("bfq: update internal depth state when queue depth changes")
>> Fixes: 39823b47bbd4 ("block/mq-deadline: Fix the tag reservation code")
>> Fixes: 42e6c6ce03fd ("lib/sbitmap: convert shallow_depth from one word to the whole sbitmap")
>> Signed-off-by: Yu Kuai <yukuai3@...wei.com>
>> ---
>> block/bfq-iosched.c | 21 ++++-----------------
>> block/blk-mq-sched.c | 3 +++
>> block/blk-mq-sched.h | 11 +++++++++++
>> block/blk-mq.c | 23 ++++++++++++-----------
>> block/elevator.h | 2 +-
>> block/kyber-iosched.c | 10 ++++------
>> block/mq-deadline.c | 15 ++-------------
>> 7 files changed, 37 insertions(+), 48 deletions(-)
>>
>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>> index 50e51047e1fe..c0c398998aa1 100644
>> --- a/block/bfq-iosched.c
>> +++ b/block/bfq-iosched.c
>> @@ -7109,9 +7109,10 @@ void bfq_put_async_queues(struct bfq_data *bfqd, struct bfq_group *bfqg)
>> * See the comments on bfq_limit_depth for the purpose of
>> * the depths set in the function. Return minimum shallow depth we'll use.
>> */
>> -static void bfq_update_depths(struct bfq_data *bfqd, struct sbitmap_queue *bt)
>> +static void bfq_depth_updated(struct request_queue *q)
>> {
>> - unsigned int nr_requests = bfqd->queue->nr_requests;
>> + struct bfq_data *bfqd = q->elevator->elevator_data;
>> + unsigned int nr_requests = q->nr_requests;
>>
>> /*
>> * In-word depths if no bfq_queue is being weight-raised:
>> @@ -7143,21 +7144,8 @@ static void bfq_update_depths(struct bfq_data *bfqd, struct sbitmap_queue *bt)
>> bfqd->async_depths[1][0] = max((nr_requests * 3) >> 4, 1U);
>> /* no more than ~37% of tags for sync writes (~20% extra tags) */
>> bfqd->async_depths[1][1] = max((nr_requests * 6) >> 4, 1U);
>> -}
>> -
>> -static void bfq_depth_updated(struct blk_mq_hw_ctx *hctx)
>> -{
>> - struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;
>> - struct blk_mq_tags *tags = hctx->sched_tags;
>>
>> - bfq_update_depths(bfqd, &tags->bitmap_tags);
>> - sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, 1);
>> -}
>> -
>> -static int bfq_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int index)
>> -{
>> - bfq_depth_updated(hctx);
>> - return 0;
>> + blk_mq_set_min_shallow_depth(q, 1);
>> }
>>
>> static void bfq_exit_queue(struct elevator_queue *e)
>> @@ -7628,7 +7616,6 @@ static struct elevator_type iosched_bfq_mq = {
>> .request_merged = bfq_request_merged,
>> .has_work = bfq_has_work,
>> .depth_updated = bfq_depth_updated,
>> - .init_hctx = bfq_init_hctx,
>> .init_sched = bfq_init_queue,
>> .exit_sched = bfq_exit_queue,
>> },
>> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
>> index e2ce4a28e6c9..bf7dd97422ec 100644
>> --- a/block/blk-mq-sched.c
>> +++ b/block/blk-mq-sched.c
>> @@ -585,6 +585,9 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e,
>> }
>> }
>> }
>> +
>> + if (e->ops.depth_updated)
>> + e->ops.depth_updated(q);
>> return 0;
>>
>
> Overall changes look good. That said, I think it might be cleaner to structure
> it this way:
>
> elevator_switch -> blk_mq_init_sched ->init_sched ==> sets async_depth
> blk_mq_update_nr_requests ->depth_updated ==> updates async_depth
>
> This way, we don’t need to call ->depth_updated from blk_mq_init_sched.
Just to be sure, you mean calling the depth_updated method directly
inside the init_sched() method? This is indeed cleaner; each elevator
then has to use this method to initialize async_depth.
>
> In summary:
> - Avoid calling ->depth_updated during blk_mq_init_sched
> - Set async_depth when the elevator is initialized (via ->init_sched)
> - Update async_depth when nr_requests is modified through sysfs (via ->depth_updated)
>
> Thanks,
> --Nilay
> .
>
Thanks,
Kuai