linux-kernel - Re: [PATCH v2 02/23] bpf: initial support for attaching struct ops to cgroups


Message-ID: <CAHzjS_ub0KBECge8DhaEZts1aYL5bBFaU=fJ3U+ZV5XdSjq1WQ@mail.gmail.com>
Date: Thu, 30 Oct 2025 17:05:23 -0700
From: Song Liu <song@...nel.org>
To: Martin KaFai Lau <martin.lau@...ux.dev>
Cc: Song Liu <song@...nel.org>, Roman Gushchin <roman.gushchin@...ux.dev>, 
	Amery Hung <ameryhung@...il.com>, Andrew Morton <akpm@...ux-foundation.org>, 
	linux-kernel@...r.kernel.org, Alexei Starovoitov <ast@...nel.org>, 
	Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...nel.org>, 
	Shakeel Butt <shakeel.butt@...ux.dev>, Johannes Weiner <hannes@...xchg.org>, 
	Andrii Nakryiko <andrii@...nel.org>, JP Kobryn <inwardvessel@...il.com>, linux-mm@...ck.org, 
	cgroups@...r.kernel.org, bpf@...r.kernel.org, 
	Martin KaFai Lau <martin.lau@...nel.org>, Kumar Kartikeya Dwivedi <memxor@...il.com>, Tejun Heo <tj@...nel.org>
Subject: Re: [PATCH v2 02/23] bpf: initial support for attaching struct ops to cgroups

On Thu, Oct 30, 2025 at 3:42 PM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
[...]
>
> The link can be detached (struct_ops's unreg) by the user space.
>
> The link can also be detached from the subsystem (cgroup) here.
> It was requested by scx:
> https://lore.kernel.org/all/20240530065946.979330-7-thinker.li@gmail.com/
>
> Not sure if scx has started using it.

I see. The user space can poll the link fd, and get notified when the
cgroup is removed.
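
A minimal sketch of that user-space side, assuming the kernel reports
detachment of the link as a hang-up (POLLHUP) on the link fd. The helper
name wait_for_hangup() is hypothetical, and how the link fd is obtained
(for example via libbpf's bpf_link__fd()) is left out:

```c
#include <errno.h>
#include <poll.h>

/*
 * Block until the kernel reports a hang-up (or error) on fd.
 *
 * Assumption: detaching the struct_ops link, e.g. because the cgroup
 * it was attached to went away, shows up as POLLHUP on the link fd.
 *
 * Returns 1 if a hang-up/error was observed, 0 on timeout, -1 on failure.
 * timeout_ms follows poll(2): -1 waits forever, 0 returns immediately.
 */
static int wait_for_hangup(int fd, int timeout_ms)
{
	/* POLLHUP and POLLERR are always reported, so events can stay 0. */
	struct pollfd pfd = { .fd = fd, .events = 0 };

	for (;;) {
		int n = poll(&pfd, 1, timeout_ms);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted: retry with the full timeout */
			return -1;
		}
		if (n == 0)
			return 0;		/* timed out, still attached */
		if (pfd.revents & POLLNVAL)
			return -1;		/* fd is not open */
		if (pfd.revents & (POLLHUP | POLLERR))
			return 1;
	}
}
```

A loader could call wait_for_hangup(link_fd, -1) after attaching and
clean up its own state once the call returns 1.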

> >
> >> 3) Move the attachment out of .reg() scope entirely. reg() will register
> >> the implementation system-wide and then some 3rd-party interface
> >> (e.g. cgroupfs) should be used to select the implementation.
> >>
> >>    +: ?
> >>    -: New hard-coded interfaces might be required to enable bpf-driven
> >>       kernel customization. The "attachment" code is not shared between
> >>       various struct ops cases.
> >>       Implementing stateful struct ops'es is harder and requires passing
> >>       an additional argument (some sort of "self") to all callbacks.
> >>
> >> This approach works well for cases when there is already a selection
> >> of implementations (e.g. tcp congestion mechanisms), and bpf is adding
> >> another one.
> >
> > Another benefit of 3) is that it allows loading an OOM controller in a
> > kernel module, just like loading a file system in a kernel module. This
> > is possible with 3) because we paid the cost of adding a new select
> > attach interface.
> >
> > A semi-separate topic, option 2) enables attaching a BPF program
> > to a kernel object (a cgroup here, but could be something else). This
> > is an interesting idea, and we may find it useful in other cases (attach
> > a BPF program to a task_struct, etc.).
>
> Is there a plan for a pure kernel module oom implementation?
> I think the link-to-cgrp support here does not necessarily stop us
> from adding write-to-cgroupfs support later if a kernel module oom
> is indeed needed in the future.

I am not aware of any use cases for writing OOM handlers in modules. I
also agree that adding link-to-cgroup attachment doesn't stop us from
supporting modules in the future.

Thanks,
Song