linux-kernel - Re: [patch -mm 0/4] mm, memcg: introduce oom policies

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:   Wed, 17 Jan 2018 11:46:03 +0000
From:   Roman Gushchin <guro@...com>
To:     David Rientjes <rientjes@...gle.com>
CC:     Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...nel.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
        Tejun Heo <tj@...nel.org>, <kernel-team@...com>,
        <cgroups@...r.kernel.org>, <linux-doc@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>
Subject: Re: [patch -mm 0/4] mm, memcg: introduce oom policies

On Tue, Jan 16, 2018 at 06:14:58PM -0800, David Rientjes wrote:
> There are three significant concerns about the cgroup aware oom killer as
> it is implemented in -mm:
> 
>  (1) allows users to evade the oom killer by creating subcontainers or
>      using other controllers since scoring is done per cgroup and not
>      hierarchically,
> 
>  (2) does not allow the user to influence the decisionmaking, such that
>      important subtrees cannot be preferred or biased, and
> 
>  (3) unfairly compares the root mem cgroup using completely different
>      criteria than leaf mem cgroups and allows wildly inaccurate results
>      if oom_score_adj is used.
> 
> This patchset aims to fix (1) completely and, by doing so, introduces a
> completely extensible user interface that can be expanded in the future.
> 
> It eliminates the mount option for the cgroup aware oom killer entirely
> since it is now enabled through the root mem cgroup's oom policy.
> 
> It eliminates a pointless tunable, memory.oom_group, that unnecessarily
> pollutes the mem cgroup v2 filesystem and is invalid when cgroup v2 is
> mounted with the "groupoom" option.

You're introducing a new oom_policy knob, which has two separate sets
of possible values for the root and non-root cgroups. I don't think
it aligns with the existing cgroup v2 design.

For the root cgroup it works exactly as mount option, and both "none"
and "cgroup" values have no meaning outside of the root cgroup. We can
discuss if a knob on root cgroup is better than a mount option, or not
(I don't think so), but it has nothing to do with oom policy as you
define it for non-root cgroups.

For non-root cgroups you're introducing "all" and "tree", and the _only_
difference is that in the "all" mode all processes will be killed, rather
than the biggest in the "tree". I find these names confusing, in reality
it's more "evaluate together and kill all" and "evaluate together and
kill one".

So, it's not really the fully hierarchical approach, which I thought,
you were arguing for. You can easily do the same with adding the third
value to the memory.groupoom knob, as I've suggested earlier (say, "disable,
"kill" and "evaluate"), and will be much less confusing.

Thanks!

Roman