linux-kernel - Re: [RFC PATCH 0/5] mm: Select victim memcg using BPF_OOM

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZMe17kOoHr/eYnVT@dhcp22.suse.cz>
Date:   Mon, 31 Jul 2023 15:23:58 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     Chuyi Zhou <zhouchuyi@...edance.com>
Cc:     hannes@...xchg.org, roman.gushchin@...ux.dev, ast@...nel.org,
        daniel@...earbox.net, andrii@...nel.org, bpf@...r.kernel.org,
        linux-kernel@...r.kernel.org, wuyun.abel@...edance.com,
        robin.lu@...edance.com, muchun.song@...ux.dev,
        zhengqi.arch@...edance.com
Subject: Re: [RFC PATCH 0/5] mm: Select victim memcg using BPF_OOM_POLICY

On Mon 31-07-23 14:00:22, Chuyi Zhou wrote:
> Hello, Michal
> 
> 在 2023/7/28 01:23, Michal Hocko 写道:
[...]
> > This sounds like a very specific oom policy and that is fine. But the
> > interface shouldn't be bound to any concepts like priorities let alone
> > be bound to memcg based selection. Ideally the BPF program should get
> > the oom_control as an input and either get a hook to kill process or if
> > that is not possible then return an entity to kill (either process or
> > set of processes).
> 
> Here are two interfaces I can think of. I was wondering if you could give me
> some feedback.
> 
> 1. Add a new hook in select_bad_process(), we can attach it and return a set
> of pids or cgroup_ids which are pre-selected by user-defined policy,
> suggested by Roman. Then we could use oom_evaluate_task to find a final
> victim among them. It's user-friendly and we can offload the OOM policy to
> userspace.
> 
> 2. Add a new hook in oom_evaluate_task() and return a point to override the
> default oom_badness return-value. The simplest way to use this is to protect
> certain processes by setting the minimum score.
> 
> Of course if you have a better idea, please let me know.

Hooking into oom_evaluate_task seems the least disruptive to the
existing oom killer implementation. I would start by planing with that
and see whether useful oom policies could be defined this way. I am not
sure what is the best way to communicate user input so that a BPF prgram
can consume it though. The interface should be generic enough that it
doesn't really pre-define any specific class of policies. Maybe we can
add something completely opaque to each memcg/task? Does BPF
infrastructure allow anything like that already?

-- 
Michal Hocko
SUSE Labs