lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <602194b5-e671-5a1e-65cb-0f770c5efb6a@huaweicloud.com>
Date: Fri, 5 Sep 2025 15:19:17 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Han Guangjiang <gj.han@...mail.com>, yukuai1@...weicloud.com
Cc: axboe@...nel.dk, fanggeng@...iang.com, hanguangjiang@...iang.com,
 liangjie@...iang.com, linux-block@...r.kernel.org,
 linux-kernel@...r.kernel.org, yangchen11@...iang.com,
 "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH] blk-throttle: check policy bit in blk_throtl_activated()

Hi,

在 2025/09/05 14:53, Han Guangjiang 写道:
> Hi,
> 
>>>>>     static inline bool blk_throtl_activated(struct request_queue *q)
>>>>>     {
>>>>> -        return q->td != NULL;
>>>>> +        return q->td != NULL &&
>>>>> +               test_bit(blkcg_policy_throtl.plid, q->blkcg_pols);
>>>>>     }
>>>> You can just remove the fist checking, p->td must be set if policy is
>>>> enabled. And please make blkcg_policy_enabled() inline function in
>>>> blk-cgroup.h and use it here.
>>> We intentionally kept the q->td != NULL check because we cannot guarantee
>>> that the policy module is fully loaded when this function is called.
>>> If the policy module is not loaded yet, blkcg_policy_throtl.plid might not be
>>> assigned, which could cause the test_bit() check to be incorrect.
>>>
>>> By keeping this check, we ensure that we have at least reached the cgroup
>>> configuration flow, indicating that the policy loading is complete.
>>>
>>> I'm wondering if there are any risks here and whether we should remove
>>> the q->td != NULL check?
>>
>> I think there is none. blk-throttle can't be build as module, if config is n,
>> blk_throtl_bio() is a non-function, if config is y, policy is registered during
>> init. And throtl_init() failure will panic, noted blkcg_policy_register() will
>> never fail for blk-throttle. BTW, policy pid is not a dynamic value at runtime.
> 
> I understand your point, but I think there's still a potential race
> condition during kernel initialization. While blk-throttle is built as a
> built-in module, block devices can also be built as built-in modules, and
> at the same initcall level, the loading order may be unpredictable.
> Additionally, the policy plid is allocated during blk-throttle module
> initialization, so we need to verify this timing issue.
> 
> I conducted an experiment on the qemu platform by adding a BUG() statement
> in the blk_throtl_activated() function:
> 
> ------------[ cut here ]------------
> kernel BUG at block/blk-throttle.h:157!
> Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT_RT SMP
> Modules linked in:
> CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.17-g69f6c99f1c48 #2
> Hardware name: linux,dummy-virt (DT)
> pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : submit_bio_noacct+0xec/0x388
> lr : submit_bio_noacct+0x40/0x388
> sp : ffff80008135b530
> x29: ffff80008135b530 x28: ffff00000180f000 x27: fffffdffc00876c0
> x26: 0000000000000040 x25: ffff800080a405f0 x24: 0000000000000004
> x23: ffff800080393128 x22: ffff000002720000 x21: ffff0000018c0700
> x20: 0000000000000000 x19: ffff00000207b180 x18: 0000000000000020
> x17: 0000000000000002 x16: 0000000000000002 x15: ffffffffffffffff
> x14: 0000000000000000 x13: ffff800080dff350 x12: ffffffffffffffff
> x11: 0000000000000000 x10: 0000000000000020 x9 : ffff8000804cd088
> x8 : ffff00000180f088 x7 : 0000000000000000 x6 : ffff00000207b1f8
> x5 : 0000000000000008 x4 : ffff0000018c0700 x3 : ffff7fffdee71000
> x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
> Call trace:
>   submit_bio_noacct+0xec/0x388
>   submit_bio+0xb4/0x150
>   submit_bh_wbc+0x14c/0x1d0
>   block_read_full_folio+0x25c/0x398
>   blkdev_read_folio+0x24/0x38
>   filemap_read_folio+0x34/0xf0
>   do_read_cache_folio+0x150/0x290
>   read_cache_folio+0x1c/0x30
>   read_part_sector+0x48/0xd0
>   read_lba+0x9c/0x180
>   efi_partition+0xa0/0x780
>   bdev_disk_changed+0x238/0x608
>   blkdev_get_whole+0xac/0x150
>   bdev_open+0x28c/0x418
>   bdev_file_open_by_dev+0xe0/0x150
>   disk_scan_partitions+0x74/0x160
>   device_add_disk+0x3f4/0x448
>   virtblk_probe+0x74c/0x920
>   virtio_dev_probe+0x1a4/0x250
>   really_probe+0xb4/0x2b0
>   __driver_probe_device+0x80/0x140
>   driver_probe_device+0xdc/0x178
>   __driver_attach+0x9c/0x1b8
>   bus_for_each_dev+0x7c/0xe8
>   driver_attach+0x2c/0x40
>   bus_add_driver+0xec/0x218
>   driver_register+0x68/0x138
>   __register_virtio_driver+0x2c/0x50
>   virtio_blk_init+0x74/0xd0
>   do_one_initcall+0x64/0x290
>   kernel_init_freeable+0x214/0x408
>   kernel_init+0x2c/0x1f0
>   ret_from_fork+0x10/0x20
> 
>>>From the experimental results, we can see that virtio block device is 
> executing blk_throtl_activated() during initialization in do_one_initcall()
> when loading virio-blk module_init().
> 
> Combined with the information that blk-throttle is also loaded as a
> built-in module, there exists a scenario where the block device probes
> first, at which point the plid has not been allocated yet, and
> blk_throtl_activated() is executed, followed by the loading of the
> blk-throttle built-in module.
> 
> Given this experimental evidence, I'm wondering if we should consider
> keeping the q->td != NULL check as a safeguard against this initialization
> race condition. I'd appreciate your thoughts on this matter.
> 

Yes, this make sense, thanks for the explanation. Please add comments
about this as well.

Thanks,
Kuai

> Thanks,
> Han Guangjiang
> 
> .
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ