Message-ID: <cc6f72cb-3782-4426-57c2-4d54fc4f38f2@huaweicloud.com>
Date: Wed, 23 Jul 2025 10:17:16 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Damien Le Moal <dlemoal@...nel.org>, Yu Kuai <yukuai1@...weicloud.com>,
hare@...e.de, tj@...nel.org, josef@...icpanda.com, axboe@...nel.dk
Cc: cgroups@...r.kernel.org, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, yi.zhang@...wei.com, yangerkun@...wei.com,
johnny.chenyi@...wei.com, "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH 4/6] elevator: factor elevator lock out of dispatch_request method
Hi,
On 2025/07/23 9:59, Damien Le Moal wrote:
> On 7/22/25 4:24 PM, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@...wei.com>
>>
>> Currently, both mq-deadline and bfq have global spin lock that will be
>> grabbed inside elevator methods like dispatch_request, insert_requests,
>> and bio_merge. And the global lock is the main reason mq-deadline and
>> bfq can't scale very well.
>>
>> For dispatch_request method, current behavior is dispatching one request at
>
> s/current/the current
>
>> a time. In the case of multiple dispatching contexts, this behavior will
>> cause huge lock contention and messing up the requests dispatching
>
> s/messing up/change
>
>> order. And folloiwng patches will support requests batch dispatching to
>
> s/folloiwng/following
>
>> fix thoses problems.
>>
>> While dispatching request, blk_mq_get_dispatch_budget() and
>> blk_mq_get_driver_tag() must be called, and they are not ready to be
>> called inside elevator methods, hence introduce a new method like
>> dispatch_requests is not possible.
>>
>> In conclusion, this patch factor the global lock out of dispatch_request
>> method, and following patches will support request batch dispatch by
>> calling the methods multiple time while holding the lock.
>
> You are creating a bisect problem here. This patch breaks the schedulers
> dispatch atomicity without the changes to the calls to the elevator methods in
> the block layer.
I'm not sure why there would be a bisect problem; I think checking out
any patch in this set should work just fine. Can you please explain a
bit more?
>
> So maybe reorganize these patches to have the block layer changes first, and
> move patch 1 and 3 after these to switch mq-deadline and bfq to using the
> higher level lock correctly, removing the locking from bfq_dispatch_request()
> and dd_dispatch_request().
Sure, I can do the reorganization.
Thanks,
Kuai
>
>>
>> Signed-off-by: Yu Kuai <yukuai3@...wei.com>
>> ---
>> block/bfq-iosched.c | 3 ---
>> block/blk-mq-sched.c | 6 ++++++
>> block/mq-deadline.c | 5 +----
>> 3 files changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>> index 11b81b11242c..9f8a256e43f2 100644
>> --- a/block/bfq-iosched.c
>> +++ b/block/bfq-iosched.c
>> @@ -5307,8 +5307,6 @@ static struct request *bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
>> struct bfq_queue *in_serv_queue;
>> bool waiting_rq, idle_timer_disabled = false;
>>
>> - spin_lock_irq(bfqd->lock);
>> -
>> in_serv_queue = bfqd->in_service_queue;
>> waiting_rq = in_serv_queue && bfq_bfqq_wait_request(in_serv_queue);
>>
>> @@ -5318,7 +5316,6 @@ static struct request *bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
>> waiting_rq && !bfq_bfqq_wait_request(in_serv_queue);
>> }
>>
>> - spin_unlock_irq(bfqd->lock);
>> bfq_update_dispatch_stats(hctx->queue, rq,
>> idle_timer_disabled ? in_serv_queue : NULL,
>> idle_timer_disabled);
>> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
>> index 55a0fd105147..82c4f4eef9ed 100644
>> --- a/block/blk-mq-sched.c
>> +++ b/block/blk-mq-sched.c
>> @@ -98,6 +98,7 @@ static int __blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
>> max_dispatch = hctx->queue->nr_requests;
>>
>> do {
>> + bool sq_sched = blk_queue_sq_sched(q);
>> struct request *rq;
>> int budget_token;
>>
>> @@ -113,7 +114,12 @@ static int __blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
>> if (budget_token < 0)
>> break;
>>
>> + if (sq_sched)
>> + spin_lock_irq(&e->lock);
>> rq = e->type->ops.dispatch_request(hctx);
>> + if (sq_sched)
>> + spin_unlock_irq(&e->lock);
>> +
>> if (!rq) {
>> blk_mq_put_dispatch_budget(q, budget_token);
>> /*
>> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
>> index e31da6de7764..a008e41bc861 100644
>> --- a/block/mq-deadline.c
>> +++ b/block/mq-deadline.c
>> @@ -466,10 +466,9 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
>> struct request *rq;
>> enum dd_prio prio;
>>
>> - spin_lock(dd->lock);
>> rq = dd_dispatch_prio_aged_requests(dd, now);
>> if (rq)
>> - goto unlock;
>> + return rq;
>>
>> /*
>> * Next, dispatch requests in priority order. Ignore lower priority
>> @@ -481,8 +480,6 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
>> break;
>> }
>>
>> -unlock:
>> - spin_unlock(dd->lock);
>> return rq;
>> }
>>
>
>
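For readers following the thread, here is a minimal userspace sketch of the locking pattern under discussion: the dispatch method no longer takes the lock itself, so the caller can either lock around each call (what this patch does) or hold the lock across several calls (what the following patches are said to add). This is only an illustration under stated assumptions, not the kernel code: struct toy_elevator and every toy_* function are hypothetical names, a C11 atomic_flag stands in for the elevator spinlock, and the dispatch budget and driver tag handling are left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_QUEUE_DEPTH 8

/* Hypothetical model of an elevator; none of these names are kernel APIs. */
struct toy_elevator {
	atomic_flag lock;                /* stands in for the elevator-wide lock */
	int queue[TOY_QUEUE_DEPTH];      /* pending "requests", oldest first */
	size_t head, tail;               /* pending entries are [head, tail) */
	unsigned long lock_acquisitions; /* how many times the lock was taken */
};

static void toy_init(struct toy_elevator *e, const int *rqs, size_t n)
{
	assert(n <= TOY_QUEUE_DEPTH);
	atomic_flag_clear(&e->lock);
	for (size_t i = 0; i < n; i++)
		e->queue[i] = rqs[i];
	e->head = 0;
	e->tail = n;
	e->lock_acquisitions = 0;
}

static void toy_lock(struct toy_elevator *e)
{
	while (atomic_flag_test_and_set_explicit(&e->lock, memory_order_acquire))
		; /* spin until the holder releases the flag */
	e->lock_acquisitions++;
}

static void toy_unlock(struct toy_elevator *e)
{
	atomic_flag_clear_explicit(&e->lock, memory_order_release);
}

/*
 * Models a dispatch method with the lock factored out: the caller must
 * already hold e->lock. Returns -1 when nothing is pending.
 */
static int toy_dispatch_locked(struct toy_elevator *e)
{
	if (e->head == e->tail)
		return -1;
	return e->queue[e->head++];
}

/* One lock round trip per request, as in the caller-side locking above. */
static size_t toy_dispatch_one_by_one(struct toy_elevator *e, int *out, size_t max)
{
	size_t n = 0;

	while (n < max) {
		int rq;

		toy_lock(e);
		rq = toy_dispatch_locked(e);
		toy_unlock(e);
		if (rq < 0)
			break;
		out[n++] = rq;
	}
	return n;
}

/* One lock round trip for the whole batch; order is decided atomically. */
static size_t toy_dispatch_batch(struct toy_elevator *e, int *out, size_t max)
{
	size_t n = 0;

	toy_lock(e);
	while (n < max) {
		int rq = toy_dispatch_locked(e);

		if (rq < 0)
			break;
		out[n++] = rq;
	}
	toy_unlock(e);
	return n;
}
```

With four pending requests and max set to 4, toy_dispatch_one_by_one() takes the lock four times while toy_dispatch_batch() takes it once, and both return the requests in the same order. With several dispatching contexts, only the batch variant keeps a whole batch from interleaving with another context's dispatches.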