Message-ID: <20250818170136.209169-4-roman.gushchin@linux.dev>
Date: Mon, 18 Aug 2025 10:01:25 -0700
From: Roman Gushchin <roman.gushchin@...ux.dev>
To: linux-mm@...ck.org,
bpf@...r.kernel.org
Cc: Suren Baghdasaryan <surenb@...gle.com>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...e.com>,
David Rientjes <rientjes@...gle.com>,
Matt Bobrowski <mattbobrowski@...gle.com>,
Song Liu <song@...nel.org>,
Kumar Kartikeya Dwivedi <memxor@...il.com>,
Alexei Starovoitov <ast@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org,
Roman Gushchin <roman.gushchin@...ux.dev>
Subject: [PATCH v1 03/14] mm: introduce bpf_oom_kill_process() bpf kfunc
Introduce bpf_oom_kill_process(), a bpf kfunc intended for use by
bpf OOM programs. It allows a bpf program to kill a process in
exactly the same way the kernel OOM killer does: using the OOM reaper,
bumping the corresponding memcg and global statistics, respecting
memory.oom.group, etc.

On success, it sets oom_control's bpf_memory_freed field to true,
enabling the bpf program to bypass the kernel OOM killer.
Signed-off-by: Roman Gushchin <roman.gushchin@...ux.dev>
---
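For reviewers who want to reason about the contract without booting a
kernel, here is a rough userspace model of the control flow described in
the kernel-doc comment of this patch. It is a sketch only: every model_*
name is hypothetical, the struct layouts are stand-ins rather than the
kernel definitions, and the rule for how the caller interprets the bpf
program's return value is an assumption made for illustration (the real
caller side is not part of this patch).

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins, NOT the kernel's task_struct / oom_control. */
struct model_task {
	int pid;
	bool unkillable;	/* stands in for oom_unkillable_task() being true */
	bool exiting;		/* stands in for tryget_task_struct() failing */
	bool killed;		/* set where the kernel would call oom_kill_process() */
};

struct model_oom_control {
	struct model_task *chosen;
	bool bpf_memory_freed;
};

/* Mirrors the order of checks in bpf_oom_kill_process() from this patch. */
int model_bpf_oom_kill_process(struct model_oom_control *oc,
			       struct model_task *task)
{
	if (task->unkillable)
		return -EPERM;

	if (task->exiting)
		return -EINVAL;

	oc->chosen = task;
	task->killed = true;	/* the kernel calls oom_kill_process() here */
	oc->chosen = NULL;
	oc->bpf_memory_freed = true;

	return 0;
}

/* Hypothetical policy: kill the first task the helper accepts. */
int model_policy(struct model_oom_control *oc,
		 struct model_task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (model_bpf_oom_kill_process(oc, &tasks[i]) == 0)
			return 1;	/* assumed to mean "handled" */
	}

	return 0;
}

/*
 * Assumed caller-side rule: without forward progress the kernel OOM
 * killer is always invoked; with progress, the program's answer counts.
 */
bool model_needs_kernel_oom_killer(const struct model_oom_control *oc,
				   int policy_ret)
{
	return !(oc->bpf_memory_freed && policy_ret);
}
```

In this model, a policy that only ever meets unkillable tasks never sets
bpf_memory_freed, so model_needs_kernel_oom_killer() stays true no
matter what the policy returns. That is the "bad bpf program can't
deadlock the machine" property the kernel-doc comment describes.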
mm/oom_kill.c | 67 +++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 67 insertions(+)
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index ad7bd65061d6..25fc5e744e27 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -1282,3 +1282,70 @@ SYSCALL_DEFINE2(process_mrelease, int, pidfd, unsigned int, flags)
 	return -ENOSYS;
 #endif /* CONFIG_MMU */
 }
+
+#ifdef CONFIG_BPF_SYSCALL
+
+__bpf_kfunc_start_defs();
+/**
+ * bpf_oom_kill_process - Kill a process the way the OOM killer does
+ * @oc: pointer to oom_control structure, describes OOM context
+ * @task: task to be killed
+ * @message__str: message to print in dmesg
+ *
+ * Kill a process in a way similar to the kernel OOM killer.
+ * This means dumping the necessary information to dmesg, adjusting memcg
+ * statistics, leveraging the oom reaper, respecting memory.oom.group, etc.
+ *
+ * bpf_oom_kill_process() marks forward progress by setting
+ * oc->bpf_memory_freed. If progress was made, the bpf program
+ * is free to decide whether the kernel oom killer should be invoked.
+ * Otherwise the kernel oom killer is enforced, so that a bad bpf
+ * program can't deadlock the machine on memory.
+ */
+__bpf_kfunc int bpf_oom_kill_process(struct oom_control *oc,
+				     struct task_struct *task,
+				     const char *message__str)
+{
+	if (oom_unkillable_task(task))
+		return -EPERM;
+
+	/* paired with put_task_struct() in oom_kill_process() */
+	task = tryget_task_struct(task);
+	if (!task)
+		return -EINVAL;
+
+	oc->chosen = task;
+
+	oom_kill_process(oc, message__str);
+
+	oc->chosen = NULL;
+	oc->bpf_memory_freed = true;
+
+	return 0;
+}
+
+__bpf_kfunc_end_defs();
+
+BTF_KFUNCS_START(bpf_oom_kfuncs)
+BTF_ID_FLAGS(func, bpf_oom_kill_process, KF_SLEEPABLE | KF_TRUSTED_ARGS)
+BTF_KFUNCS_END(bpf_oom_kfuncs)
+
+static const struct btf_kfunc_id_set bpf_oom_kfunc_set = {
+	.owner = THIS_MODULE,
+	.set = &bpf_oom_kfuncs,
+};
+
+static int __init bpf_oom_init(void)
+{
+	int err;
+
+	err = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS,
+					&bpf_oom_kfunc_set);
+	if (err)
+		pr_warn("error while registering bpf oom kfuncs: %d\n", err);
+
+	return err;
+}
+late_initcall(bpf_oom_init);
+
+#endif
--
2.50.1