Message-ID: <CAHzjS_vp3xpCx8w9k7ct1RHOLnwu5og59Uxqs9DE_Ye06x3m4w@mail.gmail.com>
Date: Thu, 30 Oct 2025 10:56:46 -0700
From: Song Liu <song@...nel.org>
To: Tejun Heo <tj@...nel.org>
Cc: Song Liu <song@...nel.org>, Roman Gushchin <roman.gushchin@...ux.dev>, 
	Andrew Morton <akpm@...ux-foundation.org>, linux-kernel@...r.kernel.org, 
	Alexei Starovoitov <ast@...nel.org>, Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...nel.org>, 
	Shakeel Butt <shakeel.butt@...ux.dev>, Johannes Weiner <hannes@...xchg.org>, 
	Andrii Nakryiko <andrii@...nel.org>, JP Kobryn <inwardvessel@...il.com>, linux-mm@...ck.org, 
	cgroups@...r.kernel.org, bpf@...r.kernel.org, 
	Martin KaFai Lau <martin.lau@...nel.org>, Kumar Kartikeya Dwivedi <memxor@...il.com>
Subject: Re: [PATCH v2 02/23] bpf: initial support for attaching struct ops to cgroups

On Thu, Oct 30, 2025 at 9:14 AM Tejun Heo <tj@...nel.org> wrote:
>
> Hello,
>
> On Wed, Oct 29, 2025 at 09:32:44PM -0700, Song Liu wrote:
> > If the use case is to attach a single struct_ops to a single cgroup, the author
> > of that BPF program can always ignore the memcg parameter and use
> > global variables, etc. We waste a register in the BPF ISA to hold the
> > memcg pointer, but the JIT may recover that in native instructions.
> >
> > OTOH, if we start without a memcg parameter, it will be impossible to
> > allow attaching the same struct_ops to different cgroups. I still think
> > it is a valid use case for the sysadmin to load a set of OOM handlers for
> > users in the containers to choose from.
>
> I find something like that being implemented through struct_ops attaching
> rather unlikely. Wouldn't it look more like the following?
>
> - Attach a handler at the parent level which implements different policies.
>
> - Child cgroups pick the desired policy using e.g. cgroup xattrs and when
>   an OOM event happens, the OOM handler attached at the parent implements
>   the requested policy.

OK, using xattrs is another way to achieve this.

> - If further customization is desired and supported, it's implemented
>   through child loading its own OOM handler which operates under the
>   parent's OOM handler.
>
> > Also, a per-cgroup OOM handler may need to access the memcg information
> > anyway. Without a dedicated memcg argument, the user needs to fetch it
> > from somewhere else.
>
> An OOM handler attached to a cgroup doesn't just need to handle OOM events
> in the cgroup itself. It's responsible for the whole sub-hierarchy, i.e. it
> will need accessors to reach all those memcgs anyway.
>
> Another thing to consider is that the memcg for a given cgroup can change
> as the controller is enabled and disabled. There isn't one permanent
> memcg that a given cgroup is associated with.

In the current version, bpf_oom_ops is attached to the memcg. As long as
we pass a pointer to the memcg to every struct_ops function, these functions
can be implemented in a stateless way. I think having the option to do
this stateless implementation will help us in the long term.
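
To illustrate the stateless point, here is a minimal userspace C sketch.
It is not BPF and not taken from the patch set: struct memcg_model,
struct oom_ops_model, over_limit() and handle_oom() are hypothetical
names that only model the idea of one ops table whose callbacks receive
the memcg they are invoked for.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's memcg; not the real structure. */
struct memcg_model {
	const char *name;
	unsigned long usage_pages;
	unsigned long limit_pages;
};

/* Hypothetical ops table: every callback is handed the memcg it runs for. */
struct oom_ops_model {
	bool (*should_kill)(const struct memcg_model *memcg);
};

/* Stateless callback: the result depends only on the argument, no globals. */
static bool over_limit(const struct memcg_model *memcg)
{
	return memcg->usage_pages > memcg->limit_pages;
}

/* One ops instance that can be shared by any number of cgroups. */
static const struct oom_ops_model shared_ops = {
	.should_kill = over_limit,
};

/* Models an OOM event in the cgroup that `ops` is attached to. */
static bool handle_oom(const struct oom_ops_model *ops,
		       const struct memcg_model *memcg)
{
	return ops->should_kill(memcg);
}
```

Because over_limit() keeps no state of its own, the same shared_ops can
be used for two different memcg_model instances and give an independent
answer for each. Without the memcg argument, the callback would need a
global or a lookup to learn which cgroup it is running for.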

Thanks,
Song