[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHzjS_vhk6RM6pkfKNrDNeEC=eObofL=f9FZ51tyqrFFz9tn1w@mail.gmail.com>
Date: Wed, 29 Oct 2025 14:18:00 -0700
From: Song Liu <song@...nel.org>
To: Tejun Heo <tj@...nel.org>
Cc: Roman Gushchin <roman.gushchin@...ux.dev>, Andrew Morton <akpm@...ux-foundation.org>, 
	linux-kernel@...r.kernel.org, Alexei Starovoitov <ast@...nel.org>, 
	Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...nel.org>, 
	Shakeel Butt <shakeel.butt@...ux.dev>, Johannes Weiner <hannes@...xchg.org>, 
	Andrii Nakryiko <andrii@...nel.org>, JP Kobryn <inwardvessel@...il.com>, linux-mm@...ck.org, 
	cgroups@...r.kernel.org, bpf@...r.kernel.org, 
	Martin KaFai Lau <martin.lau@...nel.org>, Song Liu <song@...nel.org>, 
	Kumar Kartikeya Dwivedi <memxor@...il.com>
Subject: Re: [PATCH v2 02/23] bpf: initial support for attaching struct ops to cgroups
Hi Tejun,
On Wed, Oct 29, 2025 at 1:36 PM Tejun Heo <tj@...nel.org> wrote:
>
> On Wed, Oct 29, 2025 at 01:25:52PM -0700, Roman Gushchin wrote:
> > > BTW, for sched_ext sub-sched support, I'm just adding cgroup_id to
> > > struct_ops, which seems to work fine. It'd be nice to align on the same
> > > approach. What are the benefits of doing this through fd?
> >
> > Then you can attach a single struct ops to multiple cgroups (or Idk
> > sockets or processes or some other objects in the future).
> > And IMO it's just a more generic solution.
>
> I'm not very convinced that sharing a single struct_ops instance across
> multiple cgroups would be all that useful. If you map this to normal
> userspace programs, a given struct_ops instance is package of code and all
> the global data (maps). ie. it's not like running the same program multiple
> times against different targets. It's more akin to running a single program
> instance which can handle multiple targets.
>
> Maybe that's useful in some cases, but that program would have to explicitly
> distinguish the cgroups that it's attached to. I have a hard time imagining
> use cases where a single struct_ops has to service multiple disjoint cgroups
> in the hierarchy and it ends up stepping outside of the usual operation
> model of cgroups - commonality being expressed through the hierarchical
> structure.
How about we pass a pointer to mem_cgroup (and/or related pointers)
to all the callbacks in the struct_ops? AFAICT, in-kernel _ops structures like
struct file_operations and struct tcp_congestion_ops use this method. And
we can actually implement struct tcp_congestion_ops in BPF. With the
struct tcp_congestion_ops model, the struct_ops map and the struct_ops
link are both shared among multiple instances (sockets).
With this model, the system admin with root access can load a bunch of
available oom handlers, and users in their container can pick a preferred
oom handler for the sub cgroup. AFAICT, the users in the container can
pick the proper OOM handler without CAP_BPF. Does this sound useful
for some cases?
Thanks,
Song
Powered by blists - more mailing lists
 
