Message-ID: <ZMe17kOoHr/eYnVT@dhcp22.suse.cz>
Date: Mon, 31 Jul 2023 15:23:58 +0200
From: Michal Hocko <mhocko@...e.com>
To: Chuyi Zhou <zhouchuyi@...edance.com>
Cc: hannes@...xchg.org, roman.gushchin@...ux.dev, ast@...nel.org,
daniel@...earbox.net, andrii@...nel.org, bpf@...r.kernel.org,
linux-kernel@...r.kernel.org, wuyun.abel@...edance.com,
robin.lu@...edance.com, muchun.song@...ux.dev,
zhengqi.arch@...edance.com
Subject: Re: [RFC PATCH 0/5] mm: Select victim memcg using BPF_OOM_POLICY
On Mon 31-07-23 14:00:22, Chuyi Zhou wrote:
> Hello, Michal
>
> On 2023/7/28 01:23, Michal Hocko wrote:
[...]
> > This sounds like a very specific oom policy and that is fine. But the
> > interface shouldn't be bound to any concepts like priorities let alone
> > be bound to memcg based selection. Ideally the BPF program should get
> > the oom_control as an input and either get a hook to kill process or if
> > that is not possible then return an entity to kill (either process or
> > set of processes).
>
> Here are two interfaces I can think of. I was wondering if you could give me
> some feedback.
>
> 1. Add a new hook in select_bad_process(), we can attach it and return a set
> of pids or cgroup_ids which are pre-selected by user-defined policy,
> suggested by Roman. Then we could use oom_evaluate_task to find a final
> victim among them. It's user-friendly and we can offload the OOM policy to
> userspace.
>
> 2. Add a new hook in oom_evaluate_task() and return a point to override the
> default oom_badness return-value. The simplest way to use this is to protect
> certain processes by setting the minimum score.
>
> Of course if you have a better idea, please let me know.

Hooking into oom_evaluate_task seems the least disruptive to the
existing oom killer implementation. I would start by playing with that
and see whether useful oom policies could be defined this way. I am not
sure what is the best way to communicate user input so that a BPF program
can consume it though. The interface should be generic enough that it
doesn't really pre-define any specific class of policies. Maybe we can
add something completely opaque to each memcg/task? Does BPF
infrastructure allow anything like that already?
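
To make the oom_evaluate_task idea a bit more concrete, here is a rough
user-space model in C. It is only a sketch and not kernel or BPF code: the
names task_model, oom_policy_fn, protect_policy and select_victim are all
hypothetical, the badness field merely stands in for the default
oom_badness() result, and the policy_data pointer is one possible reading
of "something completely opaque" attached to each task. The selection loop
never looks inside that blob; only the policy callback does.

```c
/* User-space model only; every name here is hypothetical. */
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct task_model {
	int pid;
	long badness;       /* stand-in for the default oom_badness() score */
	void *policy_data;  /* opaque per-task blob, interpreted only by the policy */
};

/* Policy hook: returns the score to use instead of the default one. */
typedef long (*oom_policy_fn)(const struct task_model *t, long default_points);

/*
 * Example policy: treat policy_data as a pointer to an int flag.
 * A non-zero flag protects the task by returning the minimum score.
 */
static long protect_policy(const struct task_model *t, long default_points)
{
	const int *protected_flag = t->policy_data;

	if (protected_flag && *protected_flag)
		return LONG_MIN;
	return default_points;
}

/*
 * Model of the selection loop: evaluate each task, let the policy
 * override its score, and keep the highest-scoring candidate.
 * A score of LONG_MIN means "never pick this task".
 */
static const struct task_model *select_victim(const struct task_model *tasks,
					      size_t n, oom_policy_fn policy)
{
	const struct task_model *chosen = NULL;
	long chosen_points = LONG_MIN;

	assert(tasks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		long points = tasks[i].badness;

		if (policy)
			points = policy(&tasks[i], points);
		if (points == LONG_MIN)
			continue;
		if (!chosen || points > chosen_points) {
			chosen = &tasks[i];
			chosen_points = points;
		}
	}
	return chosen;
}
```

With no policy attached, the model picks the task with the largest default
score. With protect_policy attached, a task whose opaque flag is set is
skipped and the next-highest candidate is chosen instead. If every task is
protected, select_victim() returns NULL and the caller has to decide what
to do next.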
--
Michal Hocko
SUSE Labs