[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2025081526-skeptic-cough-7fda@gregkh>
Date: Fri, 15 Aug 2025 08:11:39 +0200
From: Greg KH <gregkh@...uxfoundation.org>
To: Chen Ridong <chenridong@...weicloud.com>
Cc: tj@...nel.org, hannes@...xchg.org, mkoutny@...e.com,
peterz@...radead.org, zhouchengming@...edance.com,
linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
lujialin4@...wei.com, chenridong@...wei.com
Subject: Re: [PATCH] kernfs: Fix UAF in PSI polling when open file is released
On Fri, Aug 15, 2025 at 01:34:29AM +0000, Chen Ridong wrote:
> From: Chen Ridong <chenridong@...wei.com>
>
> A use-after-free (UAF) vulnerability was identified in the PSI (Pressure
> Stall Information) monitoring mechanism:
>
> BUG: KASAN: slab-use-after-free in psi_trigger_poll+0x3c/0x140
> Read of size 8 at addr ffff3de3d50bd308 by task systemd/1
>
> psi_trigger_poll+0x3c/0x140
> cgroup_pressure_poll+0x70/0xa0
> cgroup_file_poll+0x8c/0x100
> kernfs_fop_poll+0x11c/0x1c0
> ep_item_poll.isra.0+0x188/0x2c0
>
> Allocated by task 1:
> cgroup_file_open+0x88/0x388
> kernfs_fop_open+0x73c/0xaf0
> do_dentry_open+0x5fc/0x1200
> vfs_open+0xa0/0x3f0
> do_open+0x7e8/0xd08
> path_openat+0x2fc/0x6b0
> do_filp_open+0x174/0x368
>
> Freed by task 8462:
> cgroup_file_release+0x130/0x1f8
> kernfs_drain_open_files+0x17c/0x440
> kernfs_drain+0x2dc/0x360
> kernfs_show+0x1b8/0x288
> cgroup_file_show+0x150/0x268
> cgroup_pressure_write+0x1dc/0x340
> cgroup_file_write+0x274/0x548
>
> Reproduction Steps:
> 1. Open test/cpu.pressure and establish epoll monitoring
> 2. Disable monitoring: echo 0 > test/cgroup.pressure
> 3. Re-enable monitoring: echo 1 > test/cgroup.pressure
>
> The race condition occurs because:
> 1. When cgroup.pressure is disabled (echo 0 > cgroup.pressure), it:
> - Releases PSI triggers via cgroup_file_release()
> - Frees of->priv through kernfs_drain_open_files()
> 2. While epoll still holds reference to the file and continues polling
> 3. Re-enabling (echo 1 > cgroup.pressure) accesses freed of->priv
>
> epolling disable/enable cgroup.pressure
> fd=open(cpu.pressure)
> while(1)
> ...
> epoll_wait
> kernfs_fop_poll
> kernfs_get_active = true echo 0 > cgroup.pressure
> ... cgroup_file_show
> kernfs_show
> // inactive kn
> kernfs_drain_open_files
> cft->release(of);
> kfree(ctx);
> ...
> kernfs_get_active = false
> echo 1 > cgroup.pressure
> kernfs_show
> kernfs_activate_one(kn);
> kernfs_fop_poll
> kernfs_get_active = true
> cgroup_file_poll
> psi_trigger_poll
> // UAF
> ...
> end: close(fd)
>
> Fix this by adding released flag check in kernfs_fop_poll(), which is
> treated as kn is inactive.
>
> Fixes: 34f26a15611a ("sched/psi: Per-cgroup PSI accounting disable/re-enable interface")
> Reported-by: Zhang Zhantian <zhangzhaotian@...wei.com>
> Signed-off-by: Chen Ridong <chenridong@...wei.com>
> ---
> fs/kernfs/file.c | 2 +-
> kernel/cgroup/cgroup.c | 1 +
> 2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
> index a6c692cac616..d5d01f0b9392 100644
> --- a/fs/kernfs/file.c
> +++ b/fs/kernfs/file.c
> @@ -852,7 +852,7 @@ static __poll_t kernfs_fop_poll(struct file *filp, poll_table *wait)
> struct kernfs_node *kn = kernfs_dentry_node(filp->f_path.dentry);
> __poll_t ret;
>
> - if (!kernfs_get_active(kn))
> + if (of->released || !kernfs_get_active(kn))
I can see why the cgroup change is needed, but why is this test for
released() an issue here? This feels like two different changes/fixes
for different problems? Why does testing for released matter at this
point in time?
thanks,
greg k-h
Powered by blists - more mailing lists