Message-ID: <470ab442-e5eb-4fa8-bde7-d6d2d1115a5a@linux.ibm.com>
Date: Thu, 7 Aug 2025 17:17:05 +0530
From: Nilay Shroff <nilay@...ux.ibm.com>
To: Zheng Qixing <zhengqixing@...weicloud.com>, axboe@...nel.dk
Cc: linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
        yukuai3@...wei.com, yi.zhang@...wei.com, yangerkun@...wei.com,
        houtao1@...wei.com, zhengqixing@...wei.com
Subject: Re: [PATCH] block: fix kobject double initialization in add_disk



On 8/7/25 12:50 PM, Zheng Qixing wrote:
> From: Zheng Qixing <zhengqixing@...wei.com>
> 
> Device-mapper can call add_disk() multiple times for the same gendisk
> due to its two-phase creation process (dm create + dm load). This leads
> to kobject double initialization errors when the underlying iSCSI devices
> become temporarily unavailable and then reappear.
> 
> However, if the first add_disk() call fails and is retried, the queue_kobj
> gets initialized twice, causing:
> 
> kobject: kobject (ffff88810c27bb90): tried to init an initialized object,
> something is seriously wrong.
>  Call Trace:
>   <TASK>
>   dump_stack_lvl+0x5b/0x80
>   kobject_init.cold+0x43/0x51
>   blk_register_queue+0x46/0x280
>   add_disk_fwnode+0xb5/0x280
>   dm_setup_md_queue+0x194/0x1c0
>   table_load+0x297/0x2d0
>   ctl_ioctl+0x2a2/0x480
>   dm_ctl_ioctl+0xe/0x20
>   __x64_sys_ioctl+0xc7/0x110
>   do_syscall_64+0x72/0x390
>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> Fix this by separating kobject initialization from sysfs registration:
>  - Initialize queue_kobj early during gendisk allocation
>  - add_disk() only adds the already-initialized kobject to sysfs
>  - del_gendisk() removes from sysfs but doesn't destroy the kobject
>  - Final cleanup happens when the disk is released
> 
> Fixes: 2bd85221a625 ("block: untangle request_queue refcounting from sysfs")
> Reported-by: Li Lingfeng <lilingfeng3@...wei.com>
> Closes: https://lore.kernel.org/all/83591d0b-2467-433c-bce0-5581298eb161@huawei.com/
> Signed-off-by: Zheng Qixing <zhengqixing@...wei.com>
> ---
>  block/blk-sysfs.c | 4 +---
>  block/blk.h       | 1 +
>  block/genhd.c     | 2 ++
>  3 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 396cded255ea..37d8654faff9 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -847,7 +847,7 @@ static void blk_queue_release(struct kobject *kobj)
>  	/* nothing to do here, all data is associated with the parent gendisk */
>  }
>  
> -static const struct kobj_type blk_queue_ktype = {
> +const struct kobj_type blk_queue_ktype = {
>  	.default_groups = blk_queue_attr_groups,
>  	.sysfs_ops	= &queue_sysfs_ops,
>  	.release	= blk_queue_release,
> @@ -875,7 +875,6 @@ int blk_register_queue(struct gendisk *disk)
>  	struct request_queue *q = disk->queue;
>  	int ret;
>  
> -	kobject_init(&disk->queue_kobj, &blk_queue_ktype);
>  	ret = kobject_add(&disk->queue_kobj, &disk_to_dev(disk)->kobj, "queue");
>  	if (ret < 0)
>  		goto out_put_queue_kobj;

If kobject_add() fails here, we jump to the out_put_queue_kobj label and
put disk->queue_kobj. That drops the kref of disk->queue_kobj, possibly
to zero.
The next add_disk() call no longer invokes kobject_init() (initialization
now happens outside add_disk()), so it would operate on a kobject whose
reference was already dropped. Wouldn't that lead to a use-after-free or
an inconsistent refcount state?
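
To make the refcount concern concrete, here is a minimal userspace sketch,
not kernel code. The toy_* names are hypothetical, and it assumes the
out_put_queue_kobj error path still does a put, as the hunk above suggests.
It only models the counting, not sysfs or memory lifetime.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace toy, not the kernel's struct kobject. All names are hypothetical. */
struct toy_kobj {
	int refcount;	/* stands in for the kref */
	int releases;	/* how many times the release callback ran */
	bool in_sysfs;
};

/* Like kobject_init(): the reference count starts at one. */
static void toy_init(struct toy_kobj *k)
{
	k->refcount = 1;
	k->releases = 0;
	k->in_sysfs = false;
}

/* Like kobject_put(): dropping the last reference runs the release callback. */
static void toy_put(struct toy_kobj *k)
{
	assert(k->refcount > 0);	/* a put with no reference held is a bug */
	if (--k->refcount == 0)
		k->releases++;
}

/* Shape of the patched register path: no init here, put on add failure. */
static int toy_register(struct toy_kobj *k, bool add_fails)
{
	if (add_fails) {
		toy_put(k);
		return -1;
	}
	k->in_sysfs = true;
	return 0;
}

/* Init once, fail the first register, then retry without re-initialising. */
static int toy_failed_then_retried(void)
{
	struct toy_kobj k;

	toy_init(&k);
	(void)toy_register(&k, true);	/* first add_disk() attempt fails */
	(void)toy_register(&k, false);	/* retry */
	return k.refcount;		/* refcount seen after the retry */
}
```

In this model toy_failed_then_retried() returns 0: the retry runs on an
object whose only reference was already dropped and whose release callback
already ran once, so any later put would underflow.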

> @@ -986,5 +985,4 @@ void blk_unregister_queue(struct gendisk *disk)
>  		elevator_set_none(q);
>  
>  	blk_debugfs_remove(disk);
> -	kobject_put(&disk->queue_kobj);
>  }

I'm thinking of a case where add_disk() fails after the queue is registered.
In that case we call blk_unregister_queue(), which would ideally put
disk->queue_kobj.
But if we skip that put in blk_unregister_queue() (which is what the hunk
above does) and later retry add_disk(), wouldn't kobject_add() in
blk_register_queue() complain loudly, since we would be adding a kobject
that was already added?
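
One way to frame this question: whether the retried add complains seems to
depend on whether the entry is still present in sysfs at that point. The
sketch below is a userspace toy with hypothetical names. It assumes that
adding a still-present entry fails with -EEXIST, and it does not claim which
branch the patched unwind path actually takes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Userspace toy; names are hypothetical and only sysfs presence is tracked. */
struct toy_sysfs_kobj {
	bool in_sysfs;
};

/* Assumed rule: adding an entry that is still present fails with -EEXIST. */
static int toy_add(struct toy_sysfs_kobj *k)
{
	if (k->in_sysfs)
		return -EEXIST;
	k->in_sysfs = true;
	return 0;
}

/* Like kobject_del(): remove the sysfs entry, keep the object itself. */
static void toy_del(struct toy_sysfs_kobj *k)
{
	assert(k->in_sysfs);	/* deleting an entry that was never added is a bug */
	k->in_sysfs = false;
}

/* Add, optionally delete on the unwind path, then retry the add. */
static int toy_readd(bool deleted_on_unwind)
{
	struct toy_sysfs_kobj k = { .in_sysfs = false };

	if (toy_add(&k))
		return -1;
	if (deleted_on_unwind)
		toy_del(&k);
	return toy_add(&k);
}
```

Here toy_readd(false) returns -EEXIST and toy_readd(true) returns 0, so the
answer hinges on whether the failed add_disk() removed the entry before the
retry.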

Thanks,
--Nilay

