From 11036d027cc1f3dd0a6045794fb87711c840f426 Mon Sep 17 00:00:00 2001
From: Waiman Long
Date: Sat, 22 Jun 2024 10:25:15 -0400
Subject: [PATCH] cgroup/cpuset: Prevent UAF in proc_cpuset_show()

A UAF can happen when /proc/cpuset is read as reported in [1].

When the cpuset is initialized, the root node top_cpuset.css.cgrp
will point to &cgrp_dfl_root.cgrp. In cgroup v1, the mount operation
will allocate cgroup_root, and top_cpuset.css.cgrp will point to the
allocated &cgroup_root.cgrp. When the umount operation is executed,
top_cpuset.css.cgrp will be rebound to &cgrp_dfl_root.cgrp.

The problem is that when rebinding to cgrp_dfl_root, there are cases
where the cgroup_root allocated by setting up the root for cgroup v1
is cached. This could lead to a Use-After-Free (UAF) if it is
subsequently freed. The descendant cgroups of cgroup v1 can only be
freed after the css is released. However, the css of the root will
never be released, yet the cgroup_root should be freed when it is
unmounted. This means that obtaining a reference to the css of the
root does not guarantee that css.cgrp->root will not be freed.

Fix this problem by taking a reference to the v1 cgroup root in
cpuset_bind() and releasing it in the next cpuset_bind() call. The
top_cpuset will always be bound to either cgrp_dfl_root or the
allocated v1 cgroup root. So top_cpuset will always be remounted back
to cgrp_dfl_root whenever a v1 cpuset mount is released. Access to
css->cgroup in proc_cpuset_show() is now protected under the
cpuset_mutex to make sure that a UAF access to css->cgroup is not
possible.

[1] https://syzkaller.appspot.com/bug?extid=9b1ff7be974a403aa4cd

Reported-by: Chen Ridong
Closes: https://syzkaller.appspot.com/bug?extid=9b1ff7be974a403aa4cd
Signed-off-by: Waiman Long
---
 kernel/cgroup/cpuset.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index c12b9fdb22a4..8155ad9ff927 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -4143,9 +4143,20 @@ static void cpuset_css_free(struct cgroup_subsys_state *css)
 	free_cpuset(cs);
 }
 
+/*
+ * With a cgroup v1 mount, root_css.cgroup can be freed. We need to take a
+ * reference to it to avoid UAF as proc_cpuset_show() may access the content
+ * of this cgroup.
+ */
 static void cpuset_bind(struct cgroup_subsys_state *root_css)
 {
+	static struct cgroup *v1_cgroup_root;
+
 	mutex_lock(&cpuset_mutex);
+	if (v1_cgroup_root) {
+		cgroup_put(v1_cgroup_root);
+		v1_cgroup_root = NULL;
+	}
 	spin_lock_irq(&callback_lock);
 
 	if (is_in_v2_mode()) {
@@ -4159,6 +4170,10 @@ static void cpuset_bind(struct cgroup_subsys_state *root_css)
 	}
 
 	spin_unlock_irq(&callback_lock);
+	if (!cgroup_subsys_on_dfl(cpuset_cgrp_subsys)) {
+		v1_cgroup_root = root_css->cgroup;
+		cgroup_get(v1_cgroup_root);
+	}
 	mutex_unlock(&cpuset_mutex);
 }
 
@@ -5051,10 +5066,12 @@ int proc_cpuset_show(struct seq_file *m, struct pid_namespace *ns,
 	if (!buf)
 		goto out;
 
+	mutex_lock(&cpuset_mutex);
 	css = task_get_css(tsk, cpuset_cgrp_id);
 	retval = cgroup_path_ns(css->cgroup, buf, PATH_MAX,
 				current->nsproxy->cgroup_ns);
 	css_put(css);
+	mutex_unlock(&cpuset_mutex);
 	if (retval == -E2BIG)
 		retval = -ENAMETOOLONG;
 	if (retval < 0)
-- 
2.39.3
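
The pinning scheme described in the commit message (take a reference on
the v1 root at bind time, drop it at the next bind) can be illustrated
outside the kernel. The following is only a minimal userspace sketch, not
the kernel code: struct fake_cgroup, fake_cgroup_get(), fake_cgroup_put(),
fake_bind(), run_demo() and live_roots are all hypothetical names invented
for this illustration, the refcount is a plain int, and the cpuset_mutex
locking is omitted because the sketch is single-threaded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct cgroup: only a reference count. */
struct fake_cgroup {
	int refcnt;
};

/* Number of heap-allocated roots that have not been freed yet. */
static int live_roots;

static struct fake_cgroup *fake_cgroup_alloc(void)
{
	struct fake_cgroup *cg = calloc(1, sizeof(*cg));

	if (!cg)
		return NULL;
	cg->refcnt = 1;		/* the creator's (mount's) reference */
	live_roots++;
	return cg;
}

static void fake_cgroup_get(struct fake_cgroup *cg)
{
	cg->refcnt++;
}

static void fake_cgroup_put(struct fake_cgroup *cg)
{
	if (--cg->refcnt == 0) {
		live_roots--;
		free(cg);
	}
}

/*
 * Models the shape of the patched cpuset_bind(): drop the pin taken by
 * the previous call, then pin the new root if it is a v1 root.
 */
static void fake_bind(struct fake_cgroup *root, int is_v1)
{
	static struct fake_cgroup *pinned_v1_root;

	if (pinned_v1_root) {
		fake_cgroup_put(pinned_v1_root);
		pinned_v1_root = NULL;
	}
	if (is_v1) {
		pinned_v1_root = root;
		fake_cgroup_get(pinned_v1_root);
	}
}

/* Returns 0 if the v1 root outlives umount and is freed on rebind. */
static int run_demo(void)
{
	struct fake_cgroup dfl_root = { .refcnt = 1 };	/* never freed */
	struct fake_cgroup *v1_root = fake_cgroup_alloc();

	if (!v1_root)
		return -1;

	fake_bind(v1_root, 1);		/* v1 mount: pin taken, refcnt == 2 */
	fake_cgroup_put(v1_root);	/* umount drops the creator's ref */
	if (live_roots != 1)		/* the pin must keep the root alive */
		return -1;

	fake_bind(&dfl_root, 0);	/* rebind to default root: pin dropped */
	return live_roots == 0 ? 0 : -1;
}
```

In this model the v1 root stays valid between the umount and the rebind
to the default root, which is the window in which a reader could
otherwise touch freed memory. The real patch additionally serializes the
reader in proc_cpuset_show() against cpuset_bind() with cpuset_mutex,
which this sketch does not model.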