Message-ID: <20251121040655.89584-1-jiayuan.chen@linux.dev>
Date: Fri, 21 Nov 2025 12:06:55 +0800
From: Jiayuan Chen <jiayuan.chen@...ux.dev>
To: cgroups@...r.kernel.org
Cc: tj@...nel.org,
hannes@...xchg.org,
mkoutny@...e.com,
linux-kernel@...r.kernel.org,
Jiayuan Chen <jiayuan.chen@...ux.dev>
Subject: [PATCH v2] cgroup: Add preemption protection to css_rstat_updated()

BPF programs do not disable preemption; they only disable migration.
As a result, running the cgroup_hierarchical_stats selftest triggers
the warning in [1].
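To illustrate (a sketch, not code from this patch): the BPF runtime
brackets program execution roughly like this, so preemption remains
enabled inside any kfunc the program calls:

	/* sketch of the BPF runtime's invocation pattern */
	migrate_disable();              /* pin the task to this CPU */
	ret = bpf_prog_run(prog, ctx);  /* may call css_rstat_updated() */
	migrate_enable();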
css_rstat_updated() is lockless and reentrant. However, as Tejun
pointed out [2], preemption-related considerations still apply: the
function expects to run with preemption disabled. css_rstat_updated()
can be called from BPF, where the framework does not disable
preemption, and it is already exposed to BPF programs as a kfunc, so
replacing it with a new kfunc (e.g. a bpf_-prefixed variant) would
break existing users. Instead, make css_rstat_updated() itself
preempt-safe.
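As a rough sketch of how a BPF program reaches this path (modeled on
the shape of the cgroup_hierarchical_stats selftest; the program body,
attach point usage, and per-cpu bookkeeping here are illustrative, not
the selftest's exact code):

	#include <vmlinux.h>
	#include <bpf/bpf_helpers.h>
	#include <bpf/bpf_tracing.h>

	/* kfunc exported by the kernel */
	extern void css_rstat_updated(struct cgroup_subsys_state *css,
				      int cpu) __ksym;

	SEC("fentry/cgroup_attach_task")
	int BPF_PROG(counter, struct cgroup *dst_cgrp,
		     struct task_struct *task)
	{
		/* update per-cpu stats, then mark the css updated;
		 * preemption is still enabled here, only migration
		 * is disabled.
		 */
		css_rstat_updated(&dst_cgrp->self,
				  bpf_get_smp_processor_id());
		return 0;
	}

	char LICENSE[] SEC("license") = "GPL";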
[1]:
~/tools/testing/selftests/bpf$ test_progs -a cgroup_hierarchical_stats
...
------------[ cut here ]------------
WARNING: CPU: 0 PID: 382 at kernel/cgroup/rstat.c:84
Modules linked in:
RIP: 0010:css_rstat_updated+0x9d/0x160
...
PKRU: 55555554
Call Trace:
<TASK>
bpf_prog_16a1c2d081688506_counter+0x143/0x14e
bpf_trampoline_6442524909+0x4b/0xb7
cgroup_attach_task+0x5/0x330
? __cgroup_procs_write+0x1d7/0x2f0
cgroup_procs_write+0x17/0x30
cgroup_file_write+0xa6/0x2d0
kernfs_fop_write_iter+0x188/0x240
vfs_write+0x2da/0x5a0
ksys_write+0x77/0x100
__x64_sys_write+0x19/0x30
x64_sys_call+0x79/0x26a0
do_syscall_64+0x89/0x790
? irqentry_exit+0x77/0xb0
? __this_cpu_preempt_check+0x13/0x20
? lockdep_hardirqs_on+0xce/0x170
? irqentry_exit_to_user_mode+0xf2/0x290
? irqentry_exit+0x77/0xb0
? clear_bhb_loop+0x50/0xa0
? clear_bhb_loop+0x50/0xa0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
---[ end trace 0000000000000000 ]---
[2]: https://lore.kernel.org/cgroups/445b0d155b7a3cb84452aa7010669e293e8c37db@linux.dev/T/

Signed-off-by: Jiayuan Chen <jiayuan.chen@...ux.dev>
---
v1 -> v2: Add preemption protection instead of dropping the preemption
assert
v1: https://lore.kernel.org/cgroups/445b0d155b7a3cb84452aa7010669e293e8c37db@linux.dev/T/
---
kernel/cgroup/rstat.c | 39 +++++++++++++++++++++++----------------
1 file changed, 23 insertions(+), 16 deletions(-)

diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index a198e40c799b..ec68c653545c 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -52,22 +52,7 @@ static inline struct llist_head *ss_lhead_cpu(struct cgroup_subsys *ss, int cpu)
return per_cpu_ptr(&rstat_backlog_list, cpu);
}

-/**
- * css_rstat_updated - keep track of updated rstat_cpu
- * @css: target cgroup subsystem state
- * @cpu: cpu on which rstat_cpu was updated
- *
- * Atomically inserts the css in the ss's llist for the given cpu. This is
- * reentrant safe i.e. safe against softirq, hardirq and nmi. The ss's llist
- * will be processed at the flush time to create the update tree.
- *
- * NOTE: if the user needs the guarantee that the updater either add itself in
- * the lockless list or the concurrent flusher flushes its updated stats, a
- * memory barrier is needed before the call to css_rstat_updated() i.e. a
- * barrier after updating the per-cpu stats and before calling
- * css_rstat_updated().
- */
-__bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu)
+static void __css_rstat_updated(struct cgroup_subsys_state *css, int cpu)
{
struct llist_head *lhead;
struct css_rstat_cpu *rstatc;
@@ -122,6 +107,28 @@ __bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu)
llist_add(&rstatc->lnode, lhead);
}

+/**
+ * css_rstat_updated - keep track of updated rstat_cpu
+ * @css: target cgroup subsystem state
+ * @cpu: cpu on which rstat_cpu was updated
+ *
+ * Atomically inserts the css in the ss's llist for the given cpu. This is
+ * reentrant safe i.e. safe against softirq, hardirq and nmi. The ss's llist
+ * will be processed at the flush time to create the update tree.
+ *
+ * NOTE: if the user needs the guarantee that the updater either add itself in
+ * the lockless list or the concurrent flusher flushes its updated stats, a
+ * memory barrier is needed before the call to css_rstat_updated() i.e. a
+ * barrier after updating the per-cpu stats and before calling
+ * css_rstat_updated().
+ */
+__bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu)
+{
+ preempt_disable();
+ __css_rstat_updated(css, cpu);
+ preempt_enable();
+}
+
static void __css_process_update_tree(struct cgroup_subsys_state *css, int cpu)
{
/* put @css and all ancestors on the corresponding updated lists */
--
2.43.0