linux-kernel - Re: [PATCH 2/3] blk-cgroup: fix uaf in blkcg_activate_policy() racing with blkg_free


Message-Id: <72dc0ed6-1ba9-46b8-a43f-d11c32e2f341@fnnas.com>
Date: Fri, 9 Jan 2026 00:11:40 +0800
From: "Yu Kuai" <yukuai@...as.com>
To: "Zheng Qixing" <zhengqixing@...weicloud.com>, <tj@...nel.org>, 
	<josef@...icpanda.com>, <axboe@...nel.dk>
Cc: <cgroups@...r.kernel.org>, <linux-block@...r.kernel.org>, 
	<linux-kernel@...r.kernel.org>, <yi.zhang@...wei.com>, 
	<yangerkun@...wei.com>, <houtao1@...wei.com>, <zhengqixing@...wei.com>, 
	<yukuai@...as.com>
Subject: Re: [PATCH 2/3] blk-cgroup: fix uaf in blkcg_activate_policy() racing with blkg_free_workfn()

Hi,

On 2026/1/8 9:44, Zheng Qixing wrote:
> From: Zheng Qixing <zhengqixing@...wei.com>
>
> When switching IO schedulers on a block device (e.g., loop0),
> blkcg_activate_policy() is called to allocate blkg_policy_data (pd)
> for all blkgs associated with that device's request queue.
>
> However, a race condition exists between blkcg_activate_policy() and
> concurrent blkcg deletion that leads to a use-after-free:
>
> T1 (blkcg_activate_policy):
>    - Successfully allocates pd for blkg1 (loop0->queue, blkcgA)
>    - Fails to allocate pd for blkg2 (loop0->queue, blkcgB)
>    - Goes to enomem error path to rollback blkg1's resources
>
> T2 (blkcg deletion):
>    - blkcgA is being deleted concurrently
>    - blkg1 is freed via blkg_free_workfn()
>    - blkg1->pd is freed
>
> T1 (continued):
>    - In the rollback path, accesses pd->online after blkg1->pd
>      has been freed
>    - Triggers use-after-free
>
> The issue occurs because blkcg_activate_policy() doesn't hold
> adequate protection against concurrent blkg freeing during the
> error rollback path. The call trace is as follows:
>
> ==================================================================
> BUG: KASAN: slab-use-after-free in blkcg_activate_policy+0x516/0x5f0
> Read of size 1 at addr ffff88802e1bc00c by task sh/7357
> CPU: 1 PID: 7357 Comm: sh Tainted: G           OE       6.6.0+ #1
> Call Trace:
>   <TASK>
>   blkcg_activate_policy+0x516/0x5f0
>   bfq_create_group_hierarchy+0x31/0x90
>   bfq_init_queue+0x6df/0x8e0
>   blk_mq_init_sched+0x290/0x3a0
>   elevator_switch+0x8a/0x190
>   elv_iosched_store+0x1f7/0x2a0
>   queue_attr_store+0xad/0xf0
>   kernfs_fop_write_iter+0x1ee/0x2e0
>   new_sync_write+0x154/0x260
>   vfs_write+0x313/0x3c0
>   ksys_write+0xbd/0x160
>   do_syscall_64+0x55/0x100
>   entry_SYSCALL_64_after_hwframe+0x78/0xe2
>
> Allocated by task 7357:
>   bfq_pd_alloc+0x6e/0x120
>   blkcg_activate_policy+0x141/0x5f0
>   bfq_create_group_hierarchy+0x31/0x90
>   bfq_init_queue+0x6df/0x8e0
>   blk_mq_init_sched+0x290/0x3a0
>   elevator_switch+0x8a/0x190
>   elv_iosched_store+0x1f7/0x2a0
>   queue_attr_store+0xad/0xf0
>   kernfs_fop_write_iter+0x1ee/0x2e0
>   new_sync_write+0x154/0x260
>   vfs_write+0x313/0x3c0
>   ksys_write+0xbd/0x160
>   do_syscall_64+0x55/0x100
>   entry_SYSCALL_64_after_hwframe+0x78/0xe2
>
> Freed by task 14318:
>   blkg_free_workfn+0x7f/0x200
>   process_one_work+0x2ef/0x5d0
>   worker_thread+0x38d/0x4f0
>   kthread+0x156/0x190
>   ret_from_fork+0x2d/0x50
>   ret_from_fork_asm+0x1b/0x30
>
> Fix this bug by adding q->blkcg_mutex in the enomem branch of
> blkcg_activate_policy().
>
> Fixes: f1c006f1c685 ("blk-cgroup: synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()")
> Signed-off-by: Zheng Qixing <zhengqixing@...wei.com>
> ---
>   block/blk-cgroup.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index 5e1a724a799a..af468676cad1 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -1693,9 +1693,11 @@ int blkcg_activate_policy(struct gendisk *disk, const struct blkcg_policy *pol)
>   
>   enomem:
>   	/* alloc failed, take down everything */
> +	mutex_lock(&q->blkcg_mutex);
>   	spin_lock_irq(&q->queue_lock);
>   	blkcg_policy_teardown_pds(q, pol);
>   	spin_unlock_irq(&q->queue_lock);
> +	mutex_unlock(&q->blkcg_mutex);

This looks correct. However, I think it would be better to also protect the q->blkg_list
iteration in blkcg_activate_policy() and blkg_destroy_all() with blkcg_mutex. That way every
q->blkg_list access is protected by blkcg_mutex, and it will be easier to later convert blkg
protection from queue_lock to blkcg_mutex.
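
To illustrate the locking shape suggested above, here is a rough userspace sketch, not kernel
code. It assumes pthread mutexes as stand-ins for both q->blkcg_mutex and q->queue_lock, and
the struct and helper names (struct queue, queue_add_blkg(), teardown_pds(), destroy_all()) are
hypothetical models of the kernel objects, not the real definitions. The only point is that every
walk of the list takes the outer mutex first and the inner lock second:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace models; the real structures live in block/blk-cgroup.h. */
struct pd { bool online; };

struct blkg {
	struct blkg *next;
	struct pd *pd;
};

struct queue {
	pthread_mutex_t blkcg_mutex;	/* models q->blkcg_mutex (outer, sleeping lock) */
	pthread_mutex_t queue_lock;	/* models q->queue_lock (a spinlock in the kernel) */
	struct blkg *blkg_list;		/* models q->blkg_list */
};

static void queue_init(struct queue *q)
{
	pthread_mutex_init(&q->blkcg_mutex, NULL);
	pthread_mutex_init(&q->queue_lock, NULL);
	q->blkg_list = NULL;
}

/* Link a new blkg with an online pd; list updates take both locks, outer first. */
static struct blkg *queue_add_blkg(struct queue *q)
{
	struct blkg *b = calloc(1, sizeof(*b));

	if (!b)
		return NULL;
	b->pd = calloc(1, sizeof(*b->pd));
	if (!b->pd) {
		free(b);
		return NULL;
	}
	b->pd->online = true;

	pthread_mutex_lock(&q->blkcg_mutex);
	pthread_mutex_lock(&q->queue_lock);
	b->next = q->blkg_list;
	q->blkg_list = b;
	pthread_mutex_unlock(&q->queue_lock);
	pthread_mutex_unlock(&q->blkcg_mutex);
	return b;
}

/* Models the enomem rollback: free every pd while both locks are held. */
static int teardown_pds(struct queue *q)
{
	struct blkg *b;
	int freed = 0;

	pthread_mutex_lock(&q->blkcg_mutex);
	pthread_mutex_lock(&q->queue_lock);
	for (b = q->blkg_list; b; b = b->next) {
		if (b->pd) {
			free(b->pd);
			b->pd = NULL;
			freed++;
		}
	}
	pthread_mutex_unlock(&q->queue_lock);
	pthread_mutex_unlock(&q->blkcg_mutex);
	return freed;
}

/* Models a destroy-all walk: unlink and free every blkg under the same lock order. */
static int destroy_all(struct queue *q)
{
	int destroyed = 0;

	pthread_mutex_lock(&q->blkcg_mutex);
	pthread_mutex_lock(&q->queue_lock);
	while (q->blkg_list) {
		struct blkg *b = q->blkg_list;

		q->blkg_list = b->next;
		free(b->pd);	/* free(NULL) is a no-op if the pd is already gone */
		free(b);
		destroyed++;
	}
	pthread_mutex_unlock(&q->queue_lock);
	pthread_mutex_unlock(&q->blkcg_mutex);
	return destroyed;
}
```

Because both walkers serialize on the outer mutex, the teardown path in this model can never see
a pd that the destroy path is freeing at the same moment. Whether the real blkg_destroy_all() can
take blkcg_mutex at all of its call sites still needs to be checked against the actual code.
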

>   	ret = -ENOMEM;
>   	goto out;
>   }

-- 
Thanks,
Kuai