Message-ID: <ZIgodGWoC/R07eak@dhcp22.suse.cz>
Date:   Tue, 13 Jun 2023 10:27:32 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     Yosry Ahmed <yosryahmed@...gle.com>
Cc:     程垲涛 Chengkaitao Cheng 
        <chengkaitao@...iglobal.com>, "tj@...nel.org" <tj@...nel.org>,
        "lizefan.x@...edance.com" <lizefan.x@...edance.com>,
        "hannes@...xchg.org" <hannes@...xchg.org>,
        "corbet@....net" <corbet@....net>,
        "roman.gushchin@...ux.dev" <roman.gushchin@...ux.dev>,
        "shakeelb@...gle.com" <shakeelb@...gle.com>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "brauner@...nel.org" <brauner@...nel.org>,
        "muchun.song@...ux.dev" <muchun.song@...ux.dev>,
        "viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>,
        "zhengqi.arch@...edance.com" <zhengqi.arch@...edance.com>,
        "ebiederm@...ssion.com" <ebiederm@...ssion.com>,
        "Liam.Howlett@...cle.com" <Liam.Howlett@...cle.com>,
        "chengzhihao1@...wei.com" <chengzhihao1@...wei.com>,
        "pilgrimtao@...il.com" <pilgrimtao@...il.com>,
        "haolee.swjtu@...il.com" <haolee.swjtu@...il.com>,
        "yuzhao@...gle.com" <yuzhao@...gle.com>,
        "willy@...radead.org" <willy@...radead.org>,
        "vasily.averin@...ux.dev" <vasily.averin@...ux.dev>,
        "vbabka@...e.cz" <vbabka@...e.cz>,
        "surenb@...gle.com" <surenb@...gle.com>,
        "sfr@...b.auug.org.au" <sfr@...b.auug.org.au>,
        "mcgrof@...nel.org" <mcgrof@...nel.org>,
        "sujiaxun@...ontech.com" <sujiaxun@...ontech.com>,
        "feng.tang@...el.com" <feng.tang@...el.com>,
        "cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [PATCH v3 0/2] memcontrol: support cgroup level OOM protection

On Sun 04-06-23 01:25:42, Yosry Ahmed wrote:
[...]
> There has been a parallel discussion in the cover letter thread of v4
> [1]. To summarize, at Google, we have been using OOM scores to
> describe different job priorities in a more explicit way -- regardless
> of memory usage. It is strictly priority-based OOM killing. Ties are
> broken based on memory usage.
> 
> We understand that something like memory.oom.protect has an advantage
> in the sense that you can skip killing a process if you know that it
> won't free enough memory anyway, but for an environment where multiple
> jobs of different priorities are running, we find it crucial to be
> able to define strict ordering. Some jobs are simply more important
> than others, regardless of their memory usage.

I do remember that discussion. I am not a great fan of simple
priority-based interfaces TBH. It sounds like an easy interface, but it
hits complications as soon as you try to define proper/sensible
hierarchical semantics. I can see how they might work on leaf memcgs
with statically assigned priorities, but that sounds like a very narrow
use case IMHO.
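
For illustration only, the strict ordering described in the quoted text
(priority first, memory usage as the tie-breaker) could be sketched for
flat leaf groups roughly as follows. The Job type, its fields, and
pick_victim are hypothetical names, and the convention that a lower
number means "less important" is an assumption; none of this is an
existing kernel interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    priority: int  # assumption: lower value = less important = killed first
    usage: int     # memory usage in bytes


def pick_victim(jobs):
    """Return the least important job; break ties by largest memory usage."""
    if not jobs:
        return None
    # Sort key: priority ascending, then usage descending (hence the minus).
    return min(jobs, key=lambda j: (j.priority, -j.usage))
```

Note that the sketch says nothing about how a parent's priority should
relate to the priorities of its children, which is exactly where the
hierarchical questions start.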

I do not think we can afford a plethora of different OOM selection
algorithms implemented in the kernel. Therefore we should really
consider a control interface that is as extensible and as much in line
with the existing interfaces as possible. That is why I am really open
to the OOM protection concept, which fits reasonably well with the
reclaim protection scheme. After all, the OOM killer is just a very
aggressive method of memory reclaim.
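
A minimal sketch of what a protection-based selection might look like,
assuming that protected memory is simply subtracted from a group's usage
before candidates are compared, in the same spirit as reclaim protection
shielding part of a group's memory. The function names and the
subtraction rule are assumptions for illustration, not the semantics of
the proposed memory.oom.protect.

```python
def oom_badness(usage, protect):
    """Memory usage not covered by protection (never negative)."""
    return max(0, usage - protect)


def pick_victim(groups):
    """groups maps a name to a (usage, protect) pair.

    Returns the name with the most unprotected usage, or None if
    there are no groups at all.
    """
    if not groups:
        return None
    return max(groups, key=lambda name: oom_badness(*groups[name]))
```

With this rule a large but mostly protected group can be passed over in
favour of a smaller unprotected one.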

On the other hand, I can see a need for customizable OOM victim
selection functionality. We have been through that discussion on
several other occasions, and the best thing we could come up with was
to allow plugging BPF into the victim selection process and bypassing
the system default method. No code has ever materialized from those
discussions, though. Maybe this is the time to revive that idea?
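
One possible shape of that idea, sketched in userspace Python rather
than BPF: a custom policy is consulted first, and the default method is
used when no policy is installed or the policy declines to choose. All
names here are hypothetical, and the "largest usage" default is a
simplifying assumption rather than the kernel's actual heuristic.

```python
def default_policy(tasks):
    """Simplified default: the task with the largest memory usage."""
    return max(tasks, key=lambda t: t["usage"]) if tasks else None


def select_victim(tasks, custom_policy=None):
    """Let a pluggable policy pick the victim, else fall back to the default."""
    if custom_policy is not None:
        victim = custom_policy(tasks)
        if victim is not None:
            return victim
    return default_policy(tasks)
```

A strict priority scheme would then be just one such policy, for example
`lambda tasks: min(tasks, key=lambda t: t["prio"])`, without the kernel
having to carry it as a built-in algorithm.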
 
> It would be great if we can arrive at an interface that serves this
> use case as well.
> 
> Thanks!
> 
> [1]https://lore.kernel.org/linux-mm/CAJD7tkaQdSTDX0Q7zvvYrA3Y4TcvLdWKnN3yc8VpfWRpUjcYBw@mail.gmail.com/
-- 
Michal Hocko
SUSE Labs
