Message-ID: <CAHzjS_uqFLEzvU0PTQiXajdFDsjC4gfk0Z4qMoiRQJ2uVPw6BA@mail.gmail.com>
Date: Wed, 29 Oct 2025 21:32:44 -0700
From: Song Liu <song@...nel.org>
To: Tejun Heo <tj@...nel.org>
Cc: Song Liu <song@...nel.org>, Roman Gushchin <roman.gushchin@...ux.dev>, 
	Andrew Morton <akpm@...ux-foundation.org>, linux-kernel@...r.kernel.org, 
	Alexei Starovoitov <ast@...nel.org>, Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...nel.org>, 
	Shakeel Butt <shakeel.butt@...ux.dev>, Johannes Weiner <hannes@...xchg.org>, 
	Andrii Nakryiko <andrii@...nel.org>, JP Kobryn <inwardvessel@...il.com>, linux-mm@...ck.org, 
	cgroups@...r.kernel.org, bpf@...r.kernel.org, 
	Martin KaFai Lau <martin.lau@...nel.org>, Kumar Kartikeya Dwivedi <memxor@...il.com>
Subject: Re: [PATCH v2 02/23] bpf: initial support for attaching struct ops to cgroups
On Wed, Oct 29, 2025 at 2:45 PM Tejun Heo <tj@...nel.org> wrote:
>
> Hello,
>
> On Wed, Oct 29, 2025 at 02:37:38PM -0700, Song Liu wrote:
> > On Wed, Oct 29, 2025 at 2:27 PM Tejun Heo <tj@...nel.org> wrote:
> > > Doesn't that assume that the programs are more or less stateless? Wouldn't
> > > oom handlers want to track historical information, running averages, which
> > > process expanded the most and so on?
> >
> > Yes, this does mean the program needs to store data in some BPF maps.
> > Do we have concern with the performance of BPF maps?
>
> It's just a lot more awkward to do and I have a difficult time thinking up
> reasons why one would need to do that. If you attach a single struct_ops
> instance to one cgroup, you can use global variables, maps, arena to track
> what's happening with the cgroup. If you share the same struct_ops across
> multiple cgroups, each operation has to scope per-cgroup states. I can see
> how that probably makes sense for sockets but cgroups aren't sockets. There
> are a lot fewer cgroups and they are organized in a tree.

If the use case is to attach a single struct_ops to a single cgroup, the
author of that BPF program can always ignore the memcg parameter and use
global variables, etc. We waste a register in the BPF ISA to hold the memcg
pointer, but the JIT may recover that in native instructions.

OTOH, if we start without a memcg parameter, it will be impossible to allow
attaching the same struct_ops to different cgroups. I still think it is a
valid use case for the sysadmin to load a set of OOM handlers for users in
the containers to choose from.

Also, a per-cgroup OOM handler may need to access the memcg information
anyway. Without a dedicated memcg argument, the user needs to fetch it from
somewhere else.

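To make the trade-off above concrete, here is a minimal user-space C model.
It is not BPF code and uses no kernel or libbpf API: every name in it
(memcg_model, cg_state, oom_handler_single, oom_handler_shared,
state_lookup, MAX_TRACKED) is hypothetical and exists only for this sketch.
It shows that a handler written for one cgroup can ignore the memcg
argument and keep global state, while a handler shared across cgroups can
use the same argument as a key for per-cgroup state, roughly as a BPF map
lookup would.

```c
/* User-space model only; none of these names are kernel or libbpf APIs. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED 8	/* arbitrary capacity chosen for this sketch */

/* Stand-in for the memcg pointer the handler would receive. */
struct memcg_model {
	uint64_t id;
	uint64_t usage_bytes;
};

/*
 * Variant A: handler written for exactly one cgroup. It ignores the
 * memcg argument and keeps its state in a global variable.
 */
static uint64_t single_oom_events;

static void oom_handler_single(const struct memcg_model *memcg)
{
	(void)memcg;		/* unused: costs an argument, nothing else */
	single_oom_events++;
}

/*
 * Variant B: one handler shared by several cgroups. State is scoped
 * per cgroup by keying on memcg->id (a small table models a map).
 */
struct cg_state {
	uint64_t id;
	uint64_t oom_events;
	uint64_t peak_usage;
	int used;
};

static struct cg_state shared_state[MAX_TRACKED];

/* Return the slot for @id, claiming a free slot on first use. */
static struct cg_state *state_lookup(uint64_t id)
{
	struct cg_state *free_slot = NULL;
	size_t i;

	for (i = 0; i < MAX_TRACKED; i++) {
		if (shared_state[i].used && shared_state[i].id == id)
			return &shared_state[i];
		if (!shared_state[i].used && !free_slot)
			free_slot = &shared_state[i];
	}
	if (free_slot) {
		free_slot->used = 1;
		free_slot->id = id;
	}
	return free_slot;	/* NULL when the table is full */
}

static void oom_handler_shared(const struct memcg_model *memcg)
{
	struct cg_state *st;

	assert(memcg != NULL);
	st = state_lookup(memcg->id);
	if (!st)
		return;		/* table full: skip tracking */
	st->oom_events++;
	if (memcg->usage_bytes > st->peak_usage)
		st->peak_usage = memcg->usage_bytes;
}
```

In this model both variants share one signature, so supporting the shared
case later does not require changing the single-cgroup handler.
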
Thanks,
Song