[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20241016093633.670555-1-chenridong@huaweicloud.com>
Date: Wed, 16 Oct 2024 09:36:33 +0000
From: Chen Ridong <chenridong@...weicloud.com>
To: tj@...nel.org,
lizefan.x@...edance.com,
hannes@...xchg.org,
longman@...hat.com,
mkoutny@...e.com,
john.fastabend@...il.com,
roman.gushchin@...ux.dev,
quanyang.wang@...driver.com,
ast@...nel.org
Cc: cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org,
bpf@...r.kernel.org,
chenridong@...wei.com,
wangweiyang2@...wei.com
Subject: [PATCH] cgroup/bpf: fix NULL pointer dereference at cgroup_bpf_offline
From: Chen Ridong <chenridong@...wei.com>
Kernel fault injection test reports NULL pointer dereference as follows:
BUG: kernel NULL pointer dereference, address: 0000000000000010
PGD 0 P4D 0
Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 7 UID: 0 PID: 1611 Comm: umount Tainted: G W 6.12.0-rc2
Tainted: [W]=WARN
RIP: 0010:__percpu_ref_switch_mode+0x30/0x320
RSP: 0018:ffffc90001f9be28 EFLAGS: 00000002
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001
RDX: 0000000000000001 RSI: ffffffff82c12d18 RDI: ffff88810753a7d8
RBP: ffff88810f3a0888 R08: 0000000000000001 R09: 00000000000c0000
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 0000000111edc000 CR4: 00000000000006f0
Call Trace:
percpu_ref_kill_and_confirm+0x3a/0x90
cgroup_kill_sb+0x61/0x190
deactivate_locked_super+0x35/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
syscall_exit_to_user_mode+0x1bc/0x1d0
do_syscall_64+0x74/0x140
entry_SYSCALL_64_after_hwframe+0x76/0x7e
A warning was also found in dmesg when the bug occurs,
WARNING: CPU: 5 PID: 1554 at kernel/cgroup/cgroup.c:2144
Call Trace:
? cgroup_setup_root+0x39c/0x440
cgroup1_get_tree+0x38f/0x860
? cgroup1_get_tree+0x71/0x860
vfs_get_tree+0x2c/0x100
path_mount+0x2e3/0xb90
__x64_sys_mount+0x19f/0x200
do_syscall_64+0x68/0x140
entry_SYSCALL_64_after_hwframe+0x76/0x7e
As mentioned above, when cgroup_bpf_inherit returns an error in
cgroup_setup_root, cgrp->bpf.refcnt has been exited. If cgrp->bpf.refcnt is
killed again in the cgroup_kill_sb function, the data of cgrp->bpf.refcnt
may have become NULL, leading to NULL pointer dereference.
To fix this issue, goto err when cgroup_bpf_inherit returns an error.
Additionally, if cgroup_bpf_inherit returns an error after rebinding
subsystems, the root_cgrp->self.refcnt is exited, which leads to
cgroup1_root_to_use return 1 (restart) when subsystems is mounted next.
This is due to a failure trying to get the refcnt(the root is root_cgrp,
without rebinding back to cgrp_dfl_root). So move the call to
cgroup_bpf_inherit above rebind_subsystems in the cgroup_setup_root.
Fixes: 04f8ef5643bc ("cgroup: Fix memory leak caused by missing cgroup_bpf_offline")
Signed-off-by: Chen Ridong <chenridong@...wei.com>
---
kernel/cgroup/cgroup.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 5886b95c6eae..8a0cbf95cc57 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2136,12 +2136,13 @@ int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask)
if (ret)
goto destroy_root;
- ret = rebind_subsystems(root, ss_mask);
+ ret = cgroup_bpf_inherit(root_cgrp);
if (ret)
goto exit_stats;
- ret = cgroup_bpf_inherit(root_cgrp);
- WARN_ON_ONCE(ret);
+ ret = rebind_subsystems(root, ss_mask);
+ if (ret)
+ goto exit_stats;
trace_cgroup_setup_root(root);
--
2.34.1
Powered by blists - more mailing lists