linux-kernel - Re: [RESEND v12 0/6] cgroup-aware OOM killer

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <c0393a4f-3515-b75d-5a00-f95c8284c275@sonymobile.com>
Date:   Tue, 31 Oct 2017 15:17:11 +0100
From:   peter enderborg <peter.enderborg@...ymobile.com>
To:     Johannes Weiner <hannes@...xchg.org>,
        David Rientjes <rientjes@...gle.com>
CC:     Andrew Morton <akpm@...ux-foundation.org>, <linux-mm@...ck.org>,
        Michal Hocko <mhocko@...e.com>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
        Tejun Heo <tj@...nel.org>, <kernel-team@...com>,
        <cgroups@...r.kernel.org>, <linux-doc@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, Roman Gushchin <guro@...com>
Subject: Re: [RESEND v12 0/6] cgroup-aware OOM killer

On 10/27/2017 10:05 PM, Johannes Weiner wrote:
> On Thu, Oct 26, 2017 at 02:03:41PM -0700, David Rientjes wrote:
>> On Thu, 26 Oct 2017, Johannes Weiner wrote:
>>
>>>> The nack is for three reasons:
>>>>
>>>>  (1) unfair comparison of root mem cgroup usage to bias against that mem 
>>>>      cgroup from oom kill in system oom conditions,
>>>>
>>>>  (2) the ability of users to completely evade the oom killer by attaching
>>>>      all processes to child cgroups either purposefully or unpurposefully,
>>>>      and
>>>>
>>>>  (3) the inability of userspace to effectively control oom victim  
>>>>      selection.
>>> My apologies if my summary was too reductionist.
>>>
>>> That being said, the arguments you repeat here have come up in
>>> previous threads and been responded to. This doesn't change my
>>> conclusion that your NAK is bogus.
>> They actually haven't been responded to, Roman was working through v11 and 
>> made a change on how the root mem cgroup usage was calculated that was 
>> better than previous iterations but still not an apples to apples 
>> comparison with other cgroups.  The problem is that it the calculation for 
>> leaf cgroups includes additional memory classes, so it biases against 
>> processes that are moved to non-root mem cgroups.  Simply creating mem 
>> cgroups and attaching processes should not independently cause them to 
>> become more preferred: it should be a fair comparison between the root mem 
>> cgroup and the set of leaf mem cgroups as implemented.  That is very 
>> trivial to do with hierarchical oom cgroup scoring.
> There is absolutely no value in your repeating the same stuff over and
> over again without considering what other people are telling you.
>
> Hierarchical oom scoring has other downsides, and most of us agree
> that they aren't preferable over the differences in scoring the root
> vs scoring other cgroups - in particular because the root cannot be
> controlled, doesn't even have local statistics, and so is unlikely to
> contain important work on a containerized system. Getting the ballpark
> right for the vast majority of usecases is more than good enough here.
>
>> Since the ability of userspace to control oom victim selection is not 
>> addressed whatsoever by this patchset, and the suggested method cannot be 
>> implemented on top of this patchset as you have argued because it requires 
>> a change to the heuristic itself, the patchset needs to become complete 
>> before being mergeable.
> It is complete. It just isn't a drop-in replacement for what you've
> been doing out-of-tree for years. Stop making your problem everybody
> else's problem.
>
> You can change the the heuristics later, as you have done before. Or
> you can add another configuration flag and we can phase out the old
> mode, like we do all the time.
>
I think this problem is related to the removal of the lowmemorykiller,
where this is the life-line when the user-space for some reason fails.

So I guess quite a few will have this problem.