Message-ID: <aXiHok-3SKqLncCf@tiehlicka>
Date: Tue, 27 Jan 2026 10:38:42 +0100
From: Michal Hocko <mhocko@...e.com>
To: Roman Gushchin <roman.gushchin@...ux.dev>
Cc: bpf@...r.kernel.org, Alexei Starovoitov <ast@...nel.org>,
	Matt Bobrowski <mattbobrowski@...gle.com>,
	Shakeel Butt <shakeel.butt@...ux.dev>,
	JP Kobryn <inwardvessel@...il.com>, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, Suren Baghdasaryan <surenb@...gle.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH bpf-next v3 07/17] mm: introduce BPF OOM struct ops

On Mon 26-01-26 18:44:10, Roman Gushchin wrote:
> Introduce a bpf struct ops for implementing custom OOM handling
> policies.
> 
> It's possible to load one bpf_oom_ops for the system and one
> bpf_oom_ops for every memory cgroup. In case of a memcg OOM, the
> cgroup tree is traversed from the OOM'ing memcg up to the root and
> corresponding BPF OOM handlers are executed until some memory is
> freed. If no memory is freed, the kernel OOM killer is invoked.
> 
> The struct ops provides the bpf_handle_out_of_memory() callback,
> which is expected to return 1 if it was able to free some memory and 0
> otherwise. If 1 is returned, the kernel also checks the bpf_memory_freed
> field of the oom_control structure, which is expected to be set by
> kfuncs suitable for releasing memory (which will be introduced later
> in the patch series). If both are set, OOM is considered handled,
> otherwise the next OOM handler in the chain is executed: e.g. BPF OOM
> attached to the parent cgroup or the kernel OOM killer.

I still find this dual reporting a bit confusing. I can see your
intention in having pre-defined "releasers" of memory so that BPF
handlers can be trusted more, but the handlers do have access to
oc->bpf_memory_freed, so they can manipulate it. Therefore the
additional level of protection is rather weak.

It is also not really clear to me how this works while there is an OOM
victim on the way out (i.e. the tsk_is_oom_victim() -> abort case). This
will result in no killing and therefore no bpf_memory_freed, right? The
handler itself should consider its work done. How exactly is this handled?
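
To make the two questions above concrete, here is a minimal user-space C
model of the handling as I read it from the changelog. It is only a
sketch under assumptions, not the patch code: apart from the
bpf_memory_freed field, every name (oom_control_model, run_oom_chain and
the three handlers) is hypothetical, and resetting the flag before each
handler is my assumption as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct oom_control; only the field named in
 * the changelog is modeled. */
struct oom_control_model {
	bool bpf_memory_freed;
};

/* Hypothetical handler type: returns 1 if it claims to have freed memory. */
typedef int (*oom_handler_fn)(struct oom_control_model *oc);

/* Frees memory through a "releaser"; the releasing kfunc is modeled by
 * setting the flag. */
int handler_kills_task(struct oom_control_model *oc)
{
	oc->bpf_memory_freed = true;
	return 1;
}

/* Finds a victim already on its way out and backs off: nothing is
 * killed, so nothing sets the flag, yet the handler reports success. */
int handler_sees_exiting_victim(struct oom_control_model *oc)
{
	(void)oc;
	return 1;
}

/* Frees nothing but writes the flag directly. */
int handler_forges_flag(struct oom_control_model *oc)
{
	oc->bpf_memory_freed = true;
	return 1;
}

/*
 * Walk the handlers from the OOM'ing memcg (index 0) towards the root.
 * Returns the index of the handler that is considered to have handled
 * the OOM, or -1 when the kernel OOM killer would be invoked.
 */
int run_oom_chain(oom_handler_fn *chain, size_t n,
		  struct oom_control_model *oc)
{
	assert(oc != NULL);

	for (size_t i = 0; i < n; i++) {
		oc->bpf_memory_freed = false;	/* assumed per-handler reset */
		/* Both the return value and the flag must be set. */
		if (chain[i] && chain[i](oc) == 1 && oc->bpf_memory_freed)
			return (int)i;
	}
	return -1;
}
```

In this model a chain of { handler_sees_exiting_victim,
handler_kills_task } is resolved by the second handler, a chain with
only handler_sees_exiting_victim falls through to the kernel OOM killer,
and handler_forges_flag passes the check without releasing anything.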

Also, is there any way to handle the OOM by increasing the memcg limit?
I do not see a callback for that.
-- 
Michal Hocko
SUSE Labs
