lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170905143021.GA28599@castle.dhcp.TheFacebook.com>
Date:   Tue, 5 Sep 2017 15:30:21 +0100
From:   Roman Gushchin <guro@...com>
To:     Michal Hocko <mhocko@...nel.org>
CC:     <linux-mm@...ck.org>, Vladimir Davydov <vdavydov.dev@...il.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        David Rientjes <rientjes@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Tejun Heo <tj@...nel.org>, <kernel-team@...com>,
        <cgroups@...r.kernel.org>, <linux-doc@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [v7 5/5] mm, oom: cgroup v2 mount option to disable cgroup-aware
 OOM killer

On Tue, Sep 05, 2017 at 03:44:12PM +0200, Michal Hocko wrote:
> I will go and check patch 2 more deeply but this is something that I
> wanted to sort out first.
> 
> On Mon 04-09-17 15:21:08, Roman Gushchin wrote:
> > Introducing of cgroup-aware OOM killer changes the victim selection
> > algorithm used by default: instead of picking the largest process,
> > it will pick the largest memcg and then the largest process inside.
> > 
> > This affects only cgroup v2 users.
> > 
> > To provide a way to use cgroups v2 if the old OOM victim selection
> > algorithm is preferred for some reason, the nogroupoom mount option
> > is added.
> > 
> > If set, the OOM selection is performed in a "traditional" per-process
> > way. Both oom_priority and oom_group memcg knobs are ignored.
> 
> Why is this an opt out rather than opt-in? IMHO the original oom logic
> should be preserved by default and specific workloads should opt in for
> the cgroup aware logic. Changing the global behavior depending on
> whether cgroup v2 interface is in use is more than unexpected and IMHO
> wrong approach to take. I think we should instead go with 
> oom_strategy=[alloc_task,biggest_task,cgroup]
> 
> we currently have alloc_task (via sysctl_oom_kill_allocating_task) and
> biggest_task which is the default. You are adding cgroup and the more I
> think about the more I agree that it doesn't really make sense to try to
> fit thew new semantic into the existing one (compare tasks to kill-all
> memcgs). Just introduce a new strategy and define a new semantic from
> scratch. Memcg priority and kill-all are a natural extension of this new
> strategy. This will make the life easier and easier to understand by
> users.
> 
> Does that make sense to you?

Absolutely.

The only thing: I'm not sure that we have to preserve the existing logic
as default option. For most users (except few very specific usecases),
it should be at least as good, as the existing one.

Making it opt-in means that corresponding code will be executed only
by few users, who cares. Then we should probably hide corresponding
cgroup interface (oom_group and oom_priority knobs) by default,
and it feels as unnecessary complication and is overall against
cgroup v2 interface design.

> I think we should instead go with
> oom_strategy=[alloc_task,biggest_task,cgroup]

It would be a really nice interface; although I've no idea how to implement it:
"alloc_task" is an existing sysctl, which we have to preserve;
while "cgroup" depends on cgroup v2.

Thanks!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ