Message-Id: <50d63ff3-97ef-499f-961d-cf6766a8028b@fnnas.com>
Date: Thu, 15 Jan 2026 13:24:33 +0800
From: "Yu Kuai" <yukuai@...as.com>
To: "Zheng Qixing" <zhengqixing@...weicloud.com>, <tj@...nel.org>,
<josef@...icpanda.com>, <axboe@...nel.dk>, <hch@...radead.org>
Cc: <cgroups@...r.kernel.org>, <linux-block@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <mkoutny@...e.com>,
<yi.zhang@...wei.com>, <yangerkun@...wei.com>, <houtao1@...wei.com>,
<zhengqixing@...wei.com>, <yukuai@...as.com>
Subject: Re: [PATCH v2 2/3] blk-cgroup: skip dying blkg in blkcg_activate_policy()
Hi,
On 2026/1/13 14:10, Zheng Qixing wrote:
> From: Zheng Qixing <zhengqixing@...wei.com>
>
> When switching IO schedulers on a block device, blkcg_activate_policy()
> can race with concurrent blkcg deletion, leading to a use-after-free in
> rcu_accelerate_cbs.
>
> T1: T2:
> blkg_destroy
> kill(&blkg->refcnt) // blkg->refcnt=1->0
> blkg_release // call_rcu(__blkg_release)
> ...
> blkg_free_workfn
> ->pd_free_fn(pd)
> elv_iosched_store
> elevator_switch
> ...
> iterate blkg list
> blkg_get(blkg) // blkg->refcnt=0->1
> list_del_init(&blkg->q_node)
> blkg_put(pinned_blkg) // blkg->refcnt=1->0
> blkg_release // call_rcu again
> rcu_accelerate_cbs // uaf
>
> Fix this by replacing blkg_get() with blkg_tryget(), which fails if
> the blkg's refcount has already reached zero. If blkg_tryget() fails,
> skip processing this blkg since it's already being destroyed.
>
> Link: https://lore.kernel.org/all/20260108014416.3656493-4-zhengqixing@huaweicloud.com/
> Fixes: f1c006f1c685 ("blk-cgroup: synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()")
> Signed-off-by: Zheng Qixing <zhengqixing@...wei.com>
> Reviewed-by: Christoph Hellwig <hch@....de>
> ---
> block/blk-cgroup.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index 600f8c5843ea..5dbc107eec53 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -1622,9 +1622,10 @@ int blkcg_activate_policy(struct gendisk *disk, const struct blkcg_policy *pol)
> * GFP_NOWAIT failed. Free the existing one and
> * prealloc for @blkg w/ GFP_KERNEL.
> */
> + if (!blkg_tryget(blkg))
> + continue;
So why is this check still placed before pd_alloc_fn()?
See blkg_destroy(). Can you replace this with the same check:
list_for_each_entry_reverse()
	if (hlist_unhashed(&blkg->blkcg_node))
		continue;
	if (blkg->pd[pol->plid])
		continue;
> if (pinned_blkg)
> blkg_put(pinned_blkg);
> - blkg_get(blkg);
> pinned_blkg = blkg;
>
> spin_unlock_irq(&q->queue_lock);
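For readers following the thread, the difference between an unconditional get and a tryget, as described in the commit message above, can be modelled in userspace. This is only a minimal sketch using C11 atomics: struct obj, obj_get(), obj_tryget() and obj_put() are hypothetical stand-ins, not the kernel's blkg helpers or their real implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object; not struct blkcg_gq. */
struct obj {
	atomic_int refcnt;
};

/* Unconditional get: increments even if the count has already hit zero. */
static void obj_get(struct obj *o)
{
	atomic_fetch_add(&o->refcnt, 1);
}

/* Conditional get: succeeds only while the count is still non-zero. */
static bool obj_tryget(struct obj *o)
{
	int old = atomic_load(&o->refcnt);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Returns true when the caller dropped the last reference. */
static bool obj_put(struct obj *o)
{
	return atomic_fetch_sub(&o->refcnt, 1) == 1;
}
```

In this model, once the count has gone from 1 to 0, obj_get() followed by obj_put() reports a "last put" a second time, which corresponds to the second call_rcu() in the race above, whereas obj_tryget() simply fails and the caller can skip the object.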
--
Thanks,
Kuai