Message-ID: <190e08e9-9b18-9f81-9a6c-0a0d3b996398@huaweicloud.com>
Date: Mon, 11 Aug 2025 14:17:38 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Damien Le Moal <dlemoal@...nel.org>, Yu Kuai <yukuai1@...weicloud.com>,
 hare@...e.de, jack@...e.cz, bvanassche@....org, tj@...nel.org,
 josef@...icpanda.com, axboe@...nel.dk
Cc: cgroups@...r.kernel.org, linux-block@...r.kernel.org,
 linux-kernel@...r.kernel.org, yi.zhang@...wei.com, yangerkun@...wei.com,
 johnny.chenyi@...wei.com, "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH v3 1/5] blk-mq-sched: introduce high level elevator lock

Hi,

On 2025/08/11 12:34, Damien Le Moal wrote:
> On 8/11/25 13:25, Yu Kuai wrote:
>> Hi,
>>
>>> On 2025/08/11 11:53, Damien Le Moal wrote:
>>> On 8/11/25 10:01, Yu Kuai wrote:
>>>>>> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
>>>>>> index 55a0fd105147..1a2da5edbe13 100644
>>>>>> --- a/block/blk-mq-sched.c
>>>>>> +++ b/block/blk-mq-sched.c
>>>>>> @@ -113,7 +113,14 @@ static int __blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
>>>>>>     		if (budget_token < 0)
>>>>>>     			break;
>>>>>>     
>>>>>> -		rq = e->type->ops.dispatch_request(hctx);
>>>>>> +		if (blk_queue_sq_sched(q)) {
>>>>>> +			elevator_lock(e);
>>>>>> +			rq = e->type->ops.dispatch_request(hctx);
>>>>>> +			elevator_unlock(e);
>>>>>
>>>>> I do not think this is safe for bfq since bfq uses the irqsave/irqrestore spin
>>>>> lock variant. If it is safe, this needs a big comment block explaining why
>>>>> and/or the rules regarding the scheduler use of this lock.
>>>>
>>>> It is correct; however, this patch doesn't change bfq yet, so the locking
>>>> order is:
>>>>
>>>> elevator_lock
>>>> spin_lock_irq(&bfqd->lock)
>>>> spin_unlock_irq(&bfqd->lock)
>>>> elevator_unlock
>>>>
>>>> Patch 3 removes bfqd->lock and converts this to:
>>>>
>>>> elevator_lock_irq
>>>> elevator_unlock_irq.
>>>
>>> I do not understand. Since q->elevator->lock is already taken here, without
>>> IRQs disabled, how can the bfq_dispatch_request method take this same lock
>>> again with IRQs disabled? That cannot possibly work.
>>
>> Looks like there is still some misunderstanding :( After patch 3,
>> bfq_dispatch_request() doesn't grab any lock; the elevator lock is held
>> before calling into the dispatch method.
>>
>> Before:
>>
>> elevator_lock
>> bfq_dispatch_request
>>    spin_lock_irq(&bfqd->lock)
>>    spin_unlock_irq(&bfqd->lock)
>> elevator_unlock
>>
>> After:
>> elevator_lock_irq
>> bfq_dispatch_request
>> elevator_unlock_irq
> 
> Ah, yes, I see it now.
> 
> But that is a nasty change that affects *all* schedulers, even those that do not
> need to disable IRQs because they are not using the lock in their completion
> path, e.g. mq-deadline. So I do not think that is acceptable.
> 

OK, I ran some benchmarks and didn't find any performance regression, so
I decided to use the irq version for deadline as well. Perhaps I just
didn't find such a test.

I can additionally add a flag to elevator_queue->flags, like the following,
to fix this:

diff --git a/block/elevator.h b/block/elevator.h
index 81f7700b0339..f04b9b2ae344 100644
--- a/block/elevator.h
+++ b/block/elevator.h
@@ -123,6 +123,7 @@ struct elevator_queue {
  #define ELEVATOR_FLAG_REGISTERED       0
  #define ELEVATOR_FLAG_DYING            1
  #define ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT       2
+#define ELEVATOR_FLAG_LOCK_AT_COMP     3

  #define elevator_lock(e)               spin_lock(&(e)->lock)
  #define elevator_unlock(e)             spin_unlock(&(e)->lock)
@@ -134,6 +135,22 @@ struct elevator_queue {
         spin_unlock_irqrestore(&(e)->lock, flags)
  #define elevator_lock_assert_held(e)   lockdep_assert_held(&(e)->lock)

+static inline void elevator_dispatch_lock(struct elevator_queue *e)
+{
+       if (test_bit(ELEVATOR_FLAG_LOCK_AT_COMP, &e->flags))
+               elevator_lock_irq(e);
+       else
+               elevator_lock(e);
+}
+
+static inline void elevator_dispatch_unlock(struct elevator_queue *e)
+{
+       if (test_bit(ELEVATOR_FLAG_LOCK_AT_COMP, &e->flags))
+               elevator_unlock_irq(e);
+       else
+               elevator_unlock(e);
+}
+
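
For example (just a rough sketch on top of the above, not tested), bfq would
set the new flag when it sets up its elevator_queue, and
__blk_mq_do_dispatch_sched() would switch to the helpers, so only schedulers
that also take the lock in their completion path pay the cost of disabling
IRQs:

/* block/bfq-iosched.c, in bfq's init_sched (bfq_init_queue()): bfq also
 * takes the elevator lock from the completion path after patch 3, so its
 * dispatch path must use the irq-safe variant. */
set_bit(ELEVATOR_FLAG_LOCK_AT_COMP, &eq->flags);

/* block/blk-mq-sched.c, __blk_mq_do_dispatch_sched(): mq-deadline keeps
 * the plain spin_lock(), while bfq gets spin_lock_irq(). */
if (blk_queue_sq_sched(q)) {
	elevator_dispatch_lock(e);
	rq = e->type->ops.dispatch_request(hctx);
	elevator_dispatch_unlock(e);
}

mq-deadline would simply never set the flag, so its dispatch path keeps the
non-irq lock as it does today.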

Thanks,
Kuai
