linux-kernel - Re: [v11 3/6] mm, oom: cgroup-aware OOM killer

Message-ID: <20171011130815.qjw7jfnnqz3gpn4s@dhcp22.suse.cz>
Date:   Wed, 11 Oct 2017 15:08:15 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     David Rientjes <rientjes@...gle.com>
Cc:     Roman Gushchin <guro@...com>, linux-mm@...ck.org,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Tejun Heo <tj@...nel.org>, kernel-team@...com,
        cgroups@...r.kernel.org, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [v11 3/6] mm, oom: cgroup-aware OOM killer

On Tue 10-10-17 14:13:00, David Rientjes wrote:
[...]
> For these reasons: unfair comparison of root mem cgroup usage to bias 
> against that mem cgroup from oom kill in system oom conditions, the 
> ability of users to completely evade the oom killer by attaching all 
> processes to child cgroups either purposefully or unpurposefully, and the 
> inability of userspace to effectively control oom victim selection:
> 
> Nacked-by: David Rientjes <rientjes@...gle.com>

I consider this NACK rather dubious. Evading the heuristic as you
describe requires root privileges in the default configuration because
normal users are not allowed to create subtrees. If you
really want to delegate a subtree to an untrusted entity then you do not
have to opt in to this oom strategy. We can work on additional means
that would cover those cases as well (e.g. a priority-based one, which
has been requested for other usecases).

A similar argument applies to the root memcg evaluation. While the
proposed behavior is not optimal, it would work for the general usecase
described here, where the root memcg doesn't really run any large number
of tasks. If somebody explicitly opts in to the new strategy and it
doesn't work well for their usecase, we can enhance the behavior. That
alone is not a reason to nack the whole thing.

I find it really disturbing that you keep nacking this approach just
because it doesn't suit your specific usecase, even though it doesn't
break it. Moreover, it has been stated several times already that future
improvements are possible and can cover what you have described.
-- 
Michal Hocko
SUSE Labs