[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250722014030.297537-1-inwardvessel@gmail.com>
Date: Mon, 21 Jul 2025 18:40:25 -0700
From: JP Kobryn <inwardvessel@...il.com>
To: tj@...nel.org,
shakeel.butt@...ux.dev,
mkoutny@...e.com,
yosryahmed@...gle.com,
hannes@...xchg.org,
akpm@...ux-foundation.org
Cc: linux-kernel@...r.kernel.org,
cgroups@...r.kernel.org,
kernel-team@...a.com
Subject: [PATCH 0/5 cgroup/for-6.16-fixes] harden css_create() for safe placement of call to css_rstat_init()
Within css_create() there are several error paths that lead to async
cleanup of a given css. An warning was found [0] which appears to be the
result of calling css_rstat_exit() on this cleanup path with a css having
uninitialized rstat objects. This series makes adjustments to provide up a
place in css_create() where css_rstat_init() can both 1) fail gracefully
and 2) guarantee completion before async cleanup leading to
css_rstat_exit() occurs.
The core change is splitting init_and_link_css() into two separate
functions and calling css_rstat_init() between them. The reason for this is
that because of the refcount incrementing that occurs within, this function
puts a constraint on the error handling that follows. See excerpt below:
css_create(...)
{
...
init_and_link_css(css, ...);
/* any subsequent error needs async css cleanup */
err = percpu_ref_init(...);
if (err)
goto err_free_css;
err = cgroup_idr_alloc(...);
if (err)
goto err_free_css;
err = css_rstat_init(css, ...);
if (err)
goto err_free_css;
...
err_free_css:
INIT_RCU_WORK(&css->destroy_rwork, css_free_rwork_fn);
queue_rcu_work(cgroup_destroy_wq, &css->destroy_rwork);
return ERR_PTR(err);
}
If any of the three goto jumps are taken, async cleanup will begin and
css_rstat_exit() will be invoked. But since css_rstat_init() would not have
succeeded, the warning will eventually be reached.
In this series, the code above changes to resemble something like this:
css_create(...)
{
...
init_css(css);
if (css->css_rstat_flush) {
err = css_rstat_init(css)
if (err) {
ss->css_free(css);
return ERR_PTR(err);
}
}
link_css(css, ...);
...
}
There was some refactoring done to eliminate self-imposed constraints on
where css_rstat_init() may be called. For example, one previous constraint
was that css_rstat_init() could only be called after init_and_link_css()
since that is where it gets its subsystem and cgroup fields set, and
css->ss was used to determine how to allocate rstat (base or subsystem).
The changes in this series remove this constraint by eliminating the
dependency of subsystem/cgroup information in rstat init/exit. The
resulting init/exit routines can work exclusively on rstat entities and not
be concerned with other types of entities.
The refactoring may also improve maintainability. Existing code like this:
css_rstat_init(&cgrp->self); /* for base state */
css_rstat_init(css); /* for subsystems */
... will now look like this:
cgroup_rstat_base_init(cgrp); /* for base stats */
css_rstat_init(css); /* for subsystems */
Finally, the number of nested conditionals in two functions above has been
reduced compared to the amount previously in css_rstat_init/exit(). This
could help improve code readability.
[0] https://lore.kernel.org/all/684c5802.a00a0220.279073.0016.GAE@google.com/
JP Kobryn (5):
cgroup: add exclusive css rstat init/exit api for base stats
cgroup: check for rstat flush callback at css rstat init/exit call
sites
cgroup: split init_and_link_css()
cgroup: initialize css rstat before linking to cgroups in css_create()
cgroup: break up the internal rstat init/exit logic by subsys and base
kernel/cgroup/cgroup-internal.h | 2 +
kernel/cgroup/cgroup.c | 57 ++++++++++++++++++----------
kernel/cgroup/rstat.c | 67 +++++++++++++++++----------------
3 files changed, 74 insertions(+), 52 deletions(-)
--
2.47.1
Powered by blists - more mailing lists