Message-Id: <20230810081319.65668-1-zhouchuyi@bytedance.com>
Date: Thu, 10 Aug 2023 16:13:14 +0800
From: Chuyi Zhou <zhouchuyi@...edance.com>
To: hannes@...xchg.org, mhocko@...nel.org, roman.gushchin@...ux.dev,
ast@...nel.org, daniel@...earbox.net, andrii@...nel.org,
muchun.song@...ux.dev
Cc: bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
wuyun.abel@...edance.com, robin.lu@...edance.com,
Chuyi Zhou <zhouchuyi@...edance.com>
Subject: [RFC PATCH v2 0/5] mm: Select victim using bpf_oom_evaluate_task
Changes
-------
This is v2 of the BPF OOM policy patchset.
v1 : https://lore.kernel.org/lkml/20230804093804.47039-1-zhouchuyi@bytedance.com/
v1 -> v2 changes:
- rename bpf_select_task to bpf_oom_evaluate_task and bypass the
tsk_is_oom_victim (and MMF_OOM_SKIP) logic. (Michal)
- add a new hook to set the policy's name, so dump_header() can report
which selection policy was in use. (Michal)
- add a tracepoint when select_bad_process() finds nothing. (Alan)
- add a doc to describe how it is all supposed to work. (Alan)
================
This patchset adds a new interface and uses it to select the victim when
the OOM killer is invoked. The main motivation is the need for
customizable OOM victim selection.
The new interface is a BPF hook plugged into oom_evaluate_task(). It takes
oc and the current task as parameters and returns a result indicating which
one is selected by the attached BPF program.
Several concerns, suggested by Michal, shaped the design of this
interface:
1. Hooking into oom_evaluate_task() keeps the global and memcg OOM
interfaces consistent. It also seems the least disruptive to the existing
OOM killer implementation.
2. Userspace can handle a lot on its own and provide input to the BPF
program to make a decision. Since the OOM scope iteration is already
implemented in the kernel, all the BPF program has to do is rank
processes or memcgs.
3. The new interface should bypass the current heuristic rules
(e.g., tsk_is_oom_victim and MMF_OOM_SKIP) to meet the needs of an
arbitrary OOM policy.
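To make the shape of the hook concrete, here is a rough userspace model of
the idea above: the iteration over candidates stays in the "kernel" loop,
and a pluggable policy only ranks each candidate against the one chosen so
far. This is a sketch, not the code in this series. All names in it
(struct task, struct oom_ctx, the VERDICT_* values, prio_policy,
select_victim) are made up for illustration, the prio field stands in for
whatever input userspace would hand to the BPF program, and the RSS
comparison is only a stand-in for the default badness heuristic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical verdicts; the real return values live in the patches. */
enum verdict {
	VERDICT_DEFER,	/* no opinion, use the default comparison */
	VERDICT_SELECT,	/* candidate replaces the chosen task */
	VERDICT_NEXT,	/* keep the chosen task, move on */
};

/* Simplified stand-in for a task; prio models userspace-provided input. */
struct task {
	int pid;
	long rss_pages;
	int prio;
};

/* Simplified stand-in for the "oc" argument: only tracks the choice. */
struct oom_ctx {
	const struct task *chosen;
};

typedef enum verdict (*policy_fn)(const struct oom_ctx *oc,
				  const struct task *candidate);

/* Example policy: pick the candidate with the highest prio value. */
static enum verdict prio_policy(const struct oom_ctx *oc,
				const struct task *candidate)
{
	if (!oc->chosen || candidate->prio > oc->chosen->prio)
		return VERDICT_SELECT;
	return VERDICT_NEXT;
}

/*
 * Model of the selection loop: iteration happens here, the policy only
 * ranks. With no policy attached, every candidate is deferred to the
 * default comparison (largest RSS wins in this model).
 */
static const struct task *select_victim(const struct task *tasks, size_t n,
					policy_fn policy)
{
	struct oom_ctx oc = { .chosen = NULL };
	size_t i;

	assert(tasks != NULL || n == 0);

	for (i = 0; i < n; i++) {
		const struct task *t = &tasks[i];
		enum verdict v = policy ? policy(&oc, t) : VERDICT_DEFER;

		switch (v) {
		case VERDICT_SELECT:
			oc.chosen = t;
			break;
		case VERDICT_NEXT:
			break;
		case VERDICT_DEFER:
			if (!oc.chosen || t->rss_pages > oc.chosen->rss_pages)
				oc.chosen = t;
			break;
		}
	}
	return oc.chosen;
}
```

Calling select_victim() with a NULL policy falls back to the RSS
comparison, while passing prio_policy lets the userspace-supplied ranking
decide regardless of memory usage. The same loop serves both cases, which
is the point of hooking at the per-task evaluation step.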
Chuyi Zhou (5):
mm, oom: Introduce bpf_oom_evaluate_task
mm: Add policy_name to identify OOM policies
mm: Add a tracepoint when OOM victim selection is failed
bpf: Add a OOM policy test
bpf: Add a BPF OOM policy Doc
Documentation/bpf/oom.rst | 70 +++++++++
include/linux/oom.h | 7 +
include/trace/events/oom.h | 18 +++
mm/oom_kill.c | 100 +++++++++++--
.../bpf/prog_tests/test_oom_policy.c | 140 ++++++++++++++++++
.../testing/selftests/bpf/progs/oom_policy.c | 104 +++++++++++++
6 files changed, 428 insertions(+), 11 deletions(-)
create mode 100644 Documentation/bpf/oom.rst
create mode 100644 tools/testing/selftests/bpf/prog_tests/test_oom_policy.c
create mode 100644 tools/testing/selftests/bpf/progs/oom_policy.c
--
2.20.1