[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4555470c-ea4f-244b-ed40-9403df3f5e4f@bytedance.com>
Date: Fri, 28 Jul 2023 10:34:38 +0800
From: Chuyi Zhou <zhouchuyi@...edance.com>
To: Alan Maguire <alan.maguire@...cle.com>, hannes@...xchg.org,
mhocko@...nel.org, roman.gushchin@...ux.dev, ast@...nel.org,
daniel@...earbox.net, andrii@...nel.org
Cc: bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
wuyun.abel@...edance.com, robin.lu@...edance.com
Subject: Re: [External] Re: [RFC PATCH 0/5] mm: Select victim memcg using
BPF_OOM_POLICY
Hi,
在 2023/7/27 19:43, Alan Maguire 写道:
> On 27/07/2023 08:36, Chuyi Zhou wrote:
>> This patchset tries to add a new bpf prog type and use it to select
>> a victim memcg when global OOM is invoked. The mainly motivation is
>> the need to customizable OOM victim selection functionality so that
>> we can protect more important app from OOM killer.
>>
>
> It's a nice use case, but at a high level, the approach pursued here
> is, as I understand it, discouraged for new BPF program development.
> Specifically, adding a new BPF program type with semantics like this
> is not preferred. Instead, can you look at using something like
>
> - using "fmod_ret" instead of a new program type
> - use BPF kfuncs instead of helpers.
> - add selftests in tools/testing/selftests/bpf not samples.
>
> There's some examples of how solutions have evolved from the traditional
> approach (adding a new program type, helpers etc) to using kfuncs etc on
> this list - for example HID-BPF and the BPF scheduler series - which
> should help orient you. There are presentations from Linux Plumbers 2022
> that walk through some of this too.
>
> Judging by the sample program example, all you should need here is a way
> to override the return value of bpf_oom_set_policy() - a noinline
> function that by default returns a no-op. It can then be overridden by
> an "fmod_ret" BPF program.
>
Indeed, I'll try to use kfuncs & fmod_ret.
Thanks for your advice.
--
Chuyi Zhou
> One thing you lose is cgroup specificity at BPF attach time, but you can
> always add predicates based on the cgroup to your BPF program if needed.
>
> Alan
>
>> Chuyi Zhou (5):
>> bpf: Introduce BPF_PROG_TYPE_OOM_POLICY
>> mm: Select victim memcg using bpf prog
>> libbpf, bpftool: Support BPF_PROG_TYPE_OOM_POLICY
>> bpf: Add a new bpf helper to get cgroup ino
>> bpf: Sample BPF program to set oom policy
>>
>> include/linux/bpf_oom.h | 22 ++++
>> include/linux/bpf_types.h | 2 +
>> include/linux/memcontrol.h | 6 ++
>> include/uapi/linux/bpf.h | 21 ++++
>> kernel/bpf/core.c | 1 +
>> kernel/bpf/helpers.c | 17 +++
>> kernel/bpf/syscall.c | 10 ++
>> mm/memcontrol.c | 50 +++++++++
>> mm/oom_kill.c | 185 +++++++++++++++++++++++++++++++++
>> samples/bpf/Makefile | 3 +
>> samples/bpf/oom_kern.c | 42 ++++++++
>> samples/bpf/oom_user.c | 128 +++++++++++++++++++++++
>> tools/bpf/bpftool/common.c | 1 +
>> tools/include/uapi/linux/bpf.h | 21 ++++
>> tools/lib/bpf/libbpf.c | 3 +
>> tools/lib/bpf/libbpf_probes.c | 2 +
>> 16 files changed, 514 insertions(+)
>> create mode 100644 include/linux/bpf_oom.h
>> create mode 100644 samples/bpf/oom_kern.c
>> create mode 100644 samples/bpf/oom_user.c
>>
Powered by blists - more mailing lists