Message-Id: <20220905170944.23071-1-mkoutny@suse.com>
Date: Mon, 5 Sep 2022 19:09:44 +0200
From: Michal Koutný <mkoutny@...e.com>
To: cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Cc: Tejun Heo <tj@...nel.org>, Zefan Li <lizefan.x@...edance.com>,
Johannes Weiner <hannes@...xchg.org>,
Dan Carpenter <dan.carpenter@...cle.com>
Subject: [PATCH] cgroup: Reorganize css_set_lock and kernfs path processing

The commit 74e4b956eb1c incorrectly wrapped kernfs_walk_and_get
(might_sleep) under css_set_lock (spinlock). css_set_lock is needed by
__cset_cgroup_from_root to ensure stable cset->cgrp_links. The returned
cgroup object is pinned by the css_set (*).

Because current cannot switch namespace asynchronously, the css_set is
also pinned by nsproxy->cgroup_ns (regardless of current's cgroup
migration).

Kernfs code that traverses paths relative to root_cgroup therefore does
not need css_set_lock.

(*) Except for root cgroups. The default hierarchy root (the only one
under which cgroup id and path resolution happens) is eternal, so the
point is moot for it. cgroup_show_path (VFS callback) is expected to be
synchronized (**) wrt kill_sb (VFS callback) (mnt_namespace.list with
namespace_sem).

(**) If not, it's still an issue independent of this change and of the
one being fixed.

Fixes: 74e4b956eb1c ("cgroup: Honor caller's cgroup NS when resolving path")
Reported-by: Dan Carpenter <dan.carpenter@...cle.com>
Signed-off-by: Michal Koutný <mkoutny@...e.com>
---
kernel/cgroup/cgroup.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)

I considered adding get_cgroup() into current_cgns_cgroup_from_root to
avoid reliance on the transitive pinning via css_set.
After reasoning about the absence of asynchronous NS switching and about
kill_sb of v1 hierarchies, it didn't seem to bring much benefit (and it
didn't compose well with BUG_ON(!cgrp) either).

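The reordering in the hunks below can be modelled in userspace. This is
only a sketch, not kernel code: model_lock(), resolve_root() and
walk_may_sleep() are hypothetical stand-ins for css_set_lock,
current_cgns_cgroup_from_root() and kernfs_walk_and_get(), and a plain
flag takes the place of the spinlock. It assumes, as argued above, that
the resolved root stays valid after the lock is dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct node { const char *name; };

static int in_atomic;        /* models "css_set_lock is held" */
static int slept_in_atomic;  /* set when a sleeping call ran under the lock */

static void model_lock(void)   { in_atomic = 1; }
static void model_unlock(void) { in_atomic = 0; }

static struct node ns_root = { "ns-root" };

/* Stand-in for current_cgns_cgroup_from_root(): needs the lock held. */
static struct node *resolve_root(void)
{
	assert(in_atomic);	/* plays the role of lockdep_assert_held() */
	return &ns_root;
}

/* Stand-in for kernfs_walk_and_get(): may sleep, so record misuse. */
static struct node *walk_may_sleep(struct node *root, const char *path)
{
	(void)path;
	if (in_atomic)
		slept_in_atomic = 1;
	return root;
}

/* Ordering before the patch: the sleeping walk runs under the lock. */
static struct node *lookup_old(const char *path)
{
	struct node *root, *res;

	model_lock();
	root = resolve_root();
	res = walk_may_sleep(root, path);
	model_unlock();
	return res;
}

/* Ordering after the patch: resolve under the lock, walk outside it. */
static struct node *lookup_new(const char *path)
{
	struct node *root;

	model_lock();
	root = resolve_root();
	model_unlock();
	return walk_may_sleep(root, path);
}
```

Calling lookup_old() sets slept_in_atomic while lookup_new() leaves it
clear; both return the same node, which is the behaviour the patch aims
to preserve.
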
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index e0b72eb5d283..8c9497f01332 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1391,11 +1391,16 @@ static void cgroup_destroy_root(struct cgroup_root *root)
 	cgroup_free_root(root);
 }
 
+/*
+ * Returned cgroup is without refcount but it's valid as long as cset pins it.
+ */
 static inline struct cgroup *__cset_cgroup_from_root(struct css_set *cset,
 					    struct cgroup_root *root)
 {
 	struct cgroup *res_cgroup = NULL;
 
+	lockdep_assert_held(&css_set_lock);
+
 	if (cset == &init_css_set) {
 		res_cgroup = &root->cgrp;
 	} else if (root == &cgrp_dfl_root) {
@@ -1426,8 +1431,6 @@ current_cgns_cgroup_from_root(struct cgroup_root *root)
 	struct cgroup *res = NULL;
 	struct css_set *cset;
 
-	lockdep_assert_held(&css_set_lock);
-
 	rcu_read_lock();
 
 	cset = current->nsproxy->cgroup_ns->root_cset;
@@ -1446,7 +1449,6 @@ static struct cgroup *cset_cgroup_from_root(struct css_set *cset,
 	struct cgroup *res = NULL;
 
 	lockdep_assert_held(&cgroup_mutex);
-	lockdep_assert_held(&css_set_lock);
 
 	res = __cset_cgroup_from_root(cset, root);
 
@@ -1861,8 +1863,8 @@ int cgroup_show_path(struct seq_file *sf, struct kernfs_node *kf_node,
 
 	spin_lock_irq(&css_set_lock);
 	ns_cgroup = current_cgns_cgroup_from_root(kf_cgroot);
-	len = kernfs_path_from_node(kf_node, ns_cgroup->kn, buf, PATH_MAX);
 	spin_unlock_irq(&css_set_lock);
+	len = kernfs_path_from_node(kf_node, ns_cgroup->kn, buf, PATH_MAX);
 
 	if (len >= PATH_MAX)
 		len = -ERANGE;
@@ -6649,8 +6651,8 @@ struct cgroup *cgroup_get_from_path(const char *path)
 
 	spin_lock_irq(&css_set_lock);
 	root_cgrp = current_cgns_cgroup_from_root(&cgrp_dfl_root);
-	kn = kernfs_walk_and_get(root_cgrp->kn, path);
 	spin_unlock_irq(&css_set_lock);
+	kn = kernfs_walk_and_get(root_cgrp->kn, path);
 	if (!kn)
 		goto out;
 
base-commit: a8c52eba880a6e8c07fc2130604f8e386b90b763
--
2.37.0