Message-ID: <93627e45-dc67-fd31-ef43-a93f580b0d6e@bytedance.com>
Date: Thu, 17 Aug 2023 10:51:19 +0800
From: Chuyi Zhou <zhouchuyi@...edance.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>, muchun.song@...ux.dev,
bpf <bpf@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>,
wuyun.abel@...edance.com, robin.lu@...edance.com,
Michal Hocko <mhocko@...e.com>
Subject: Re: [RFC PATCH v2 1/5] mm, oom: Introduce bpf_oom_evaluate_task
Hello,
On 2023/8/17 10:07, Alexei Starovoitov wrote:
> On Thu, Aug 10, 2023 at 1:13 AM Chuyi Zhou <zhouchuyi@...edance.com> wrote:
>>  static int oom_evaluate_task(struct task_struct *task, void *arg)
>>  {
>>  	struct oom_control *oc = arg;
>> @@ -317,6 +339,26 @@ static int oom_evaluate_task(struct task_struct *task, void *arg)
>>  	if (!is_memcg_oom(oc) && !oom_cpuset_eligible(task, oc))
>>  		goto next;
>>
>> +	/*
>> +	 * If task is allocating a lot of memory and has been marked to be
>> +	 * killed first if it triggers an oom, then select it.
>> +	 */
>> +	if (oom_task_origin(task)) {
>> +		points = LONG_MAX;
>> +		goto select;
>> +	}
>> +
>> +	switch (bpf_oom_evaluate_task(task, oc)) {
>> +	case BPF_EVAL_ABORT:
>> +		goto abort; /* abort search process */
>> +	case BPF_EVAL_NEXT:
>> +		goto next; /* ignore the task */
>> +	case BPF_EVAL_SELECT:
>> +		goto select; /* select the task */
>> +	default:
>> +		break; /* No BPF policy */
>> +	}
>> +
>
> I think forcing bpf prog to look at every task is going to be limiting
> long term.
> It's more flexible to invoke bpf prog from out_of_memory()
> and if it doesn't choose a task then fallback to select_bad_process().
> I believe that's what Roman was proposing.
> bpf can choose to iterate memcg or it might have some side knowledge
> that there are processes that can be set as oc->chosen right away,
> so it can skip the iteration.
IIUC, we may need some new BPF features if we want to iterate over
tasks/memcgs in BPF, such as:
bpf_for_each_task
bpf_for_each_memcg
bpf_for_each_task_in_memcg
...
It seems we have some work to do first on the BPF side.
Would these iteration features be useful in BPF scenarios other than
OOM policy?
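For reference, the flow suggested above (call the policy once from
out_of_memory(), and fall back to select_bad_process() if it does not
choose a task) could be sketched roughly as below. This is only a
userspace mock to illustrate the control flow, not kernel code: every
mock_* type and function is hypothetical, and the "hint" field is just an
assumed stand-in for whatever side knowledge a BPF program might keep.

```c
/* Userspace mock only: none of the mock_* names are kernel or BPF APIs. */
#include <assert.h>
#include <stddef.h>

struct mock_task {
	int pid;
	long badness;	/* stand-in for the points oom_badness() would return */
};

struct mock_oom_control {
	struct mock_task *tasks;	/* all candidate tasks */
	size_t nr_tasks;
	struct mock_task *hint;		/* assumed "side knowledge" of the policy */
	struct mock_task *chosen;	/* plays the role of oc->chosen */
};

/* Hypothetical policy hook: returns nonzero if it set oc->chosen. */
typedef int (*mock_oom_policy_fn)(struct mock_oom_control *oc);

/* Stand-in for select_bad_process(): scan every task, keep the worst one. */
static void mock_select_bad_process(struct mock_oom_control *oc)
{
	size_t i;

	for (i = 0; i < oc->nr_tasks; i++) {
		struct mock_task *t = &oc->tasks[i];

		if (!oc->chosen || t->badness > oc->chosen->badness)
			oc->chosen = t;
	}
}

/* Example policy: pick the hinted task right away, with no iteration. */
static int mock_policy_hint(struct mock_oom_control *oc)
{
	if (!oc->hint)
		return 0;	/* no opinion, let the caller fall back */
	oc->chosen = oc->hint;
	return 1;
}

/* Stand-in for out_of_memory(): policy first, default scan as fallback. */
static struct mock_task *mock_out_of_memory(struct mock_oom_control *oc,
					    mock_oom_policy_fn policy)
{
	assert(oc != NULL);

	oc->chosen = NULL;
	if (policy && policy(oc) && oc->chosen)
		return oc->chosen;

	/* The policy is absent or did not choose: use the default selection. */
	oc->chosen = NULL;
	mock_select_bad_process(oc);
	return oc->chosen;
}
```

With a hint set, mock_out_of_memory() returns the hinted task without
scanning; with no hint (or no policy), it returns the task with the
highest badness, which is the fallback path.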
Thanks.