Message-ID: <311d5313-5a51-fded-714b-420ba3f6a879@sonymobile.com>
Date: Tue, 31 Oct 2017 16:07:42 +0100
From: peter enderborg <peter.enderborg@...ymobile.com>
To: Michal Hocko <mhocko@...nel.org>
CC: Johannes Weiner <hannes@...xchg.org>,
David Rientjes <rientjes@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
<linux-mm@...ck.org>, Vladimir Davydov <vdavydov.dev@...il.com>,
Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
Tejun Heo <tj@...nel.org>, <kernel-team@...com>,
<cgroups@...r.kernel.org>, <linux-doc@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, Roman Gushchin <guro@...com>
Subject: Re: [RESEND v12 0/6] cgroup-aware OOM killer
On 10/31/2017 03:34 PM, Michal Hocko wrote:
> On Tue 31-10-17 15:17:11, peter enderborg wrote:
>> On 10/27/2017 10:05 PM, Johannes Weiner wrote:
>>> On Thu, Oct 26, 2017 at 02:03:41PM -0700, David Rientjes wrote:
>>>> On Thu, 26 Oct 2017, Johannes Weiner wrote:
>>>>
>>>>>> The nack is for three reasons:
>>>>>>
>>>>>> (1) unfair comparison of root mem cgroup usage to bias against that mem
>>>>>> cgroup from oom kill in system oom conditions,
>>>>>>
>>>>>> (2) the ability of users to completely evade the oom killer by attaching
>>>>>> all processes to child cgroups either purposefully or unpurposefully,
>>>>>> and
>>>>>>
>>>>>> (3) the inability of userspace to effectively control oom victim
>>>>>> selection.
>>>>> My apologies if my summary was too reductionist.
>>>>>
>>>>> That being said, the arguments you repeat here have come up in
>>>>> previous threads and been responded to. This doesn't change my
>>>>> conclusion that your NAK is bogus.
>>>> They actually haven't been responded to, Roman was working through v11 and
>>>> made a change on how the root mem cgroup usage was calculated that was
>>>> better than previous iterations but still not an apples to apples
>>>> comparison with other cgroups. The problem is that the calculation for
>>>> leaf cgroups includes additional memory classes, so it biases against
>>>> processes that are moved to non-root mem cgroups. Simply creating mem
>>>> cgroups and attaching processes should not independently cause them to
>>>> become more preferred: it should be a fair comparison between the root mem
>>>> cgroup and the set of leaf mem cgroups as implemented. That is very
>>>> trivial to do with hierarchical oom cgroup scoring.
>>> There is absolutely no value in your repeating the same stuff over and
>>> over again without considering what other people are telling you.
>>>
>>> Hierarchical oom scoring has other downsides, and most of us agree
>>> that they aren't preferable over the differences in scoring the root
>>> vs scoring other cgroups - in particular because the root cannot be
>>> controlled, doesn't even have local statistics, and so is unlikely to
>>> contain important work on a containerized system. Getting the ballpark
>>> right for the vast majority of usecases is more than good enough here.
>>>
>>>> Since the ability of userspace to control oom victim selection is not
>>>> addressed whatsoever by this patchset, and the suggested method cannot be
>>>> implemented on top of this patchset as you have argued because it requires
>>>> a change to the heuristic itself, the patchset needs to become complete
>>>> before being mergeable.
>>> It is complete. It just isn't a drop-in replacement for what you've
>>> been doing out-of-tree for years. Stop making your problem everybody
>>> else's problem.
>>>
>>> You can change the heuristics later, as you have done before. Or
>>> you can add another configuration flag and we can phase out the old
>>> mode, like we do all the time.
>>>
>> I think this problem is related to the removal of the lowmemorykiller,
>> where this is the life-line when the user-space for some reason fails.
>>
>> So I guess quite a few will have this problem.
> Could you be more specific please? We are _not_ removing the possibility of
> the user space influenced oom victim selection. You can still use the
> _current_ oom selection heuristic. The patch adds a new selection method
> which is opt-in so only those who want to opt-in will not be allowed to
> have any influence on the victim selection. And as it has been pointed
> out this can be implemented later so it is not like "this won't be
> possible anymore in future"
I think the idea is to have an implementation of the lowmemorykiller selection heuristic.