Message-ID: <20170913122309.dsnbt3t3m5sa7qgk@dhcp22.suse.cz>
Date: Wed, 13 Sep 2017 14:23:09 +0200
From: Michal Hocko <mhocko@...nel.org>
To: Roman Gushchin <guro@...com>
Cc: David Rientjes <rientjes@...gle.com>, linux-mm@...ck.org,
Vladimir Davydov <vdavydov.dev@...il.com>,
Johannes Weiner <hannes@...xchg.org>,
Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
Andrew Morton <akpm@...ux-foundation.org>,
Tejun Heo <tj@...nel.org>, kernel-team@...com,
cgroups@...r.kernel.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [v8 3/4] mm, oom: add cgroup v2 mount option for cgroup-aware
OOM killer
On Tue 12-09-17 21:01:15, Roman Gushchin wrote:
> On Mon, Sep 11, 2017 at 01:48:39PM -0700, David Rientjes wrote:
> > On Mon, 11 Sep 2017, Roman Gushchin wrote:
> >
> > > Add a "groupoom" cgroup v2 mount option to enable the cgroup-aware
> > > OOM killer. If not set, the OOM selection is performed in
> > > a "traditional" per-process way.
> > >
> > > The behavior can be changed dynamically by remounting the cgroupfs.
> >
> > I can't imagine that Tejun would be happy with a new mount option,
> > especially when it's not required.
> >
> > OOM behavior does not need to be defined at mount time and for the entire
> > hierarchy. It's possible to very easily implement a tunable as part of
> > mem cgroup that is propagated to descendants and controls the oom scoring
> > behavior for that hierarchy. It does not need to be system wide and
> > affect scoring of all processes based on which mem cgroup they are
> > attached to at any given time.
>
> No, I don't think that mixing per-cgroup and per-process OOM selection
> algorithms is a good idea.
>
> So, there are 3 reasonable options:
> 1) boot option
> 2) sysctl
> 3) cgroup mount option
>
> I believe, 3) is better, because it allows changing the behavior dynamically,
> and explicitly depends on v2 (what sysctl lacks).

I see your argument here. I would just be worried that we end up really
needing more OOM strategies in the future, and those wouldn't fit into the
memcg mount option scope. So 1) or 2) sounds more extensible to me long term.
A boot-time option would be easier because we would not have to bother with
dynamic selection in that case.

> So, the only question is should it be opt-in or opt-out option.
> Personally, I would prefer opt-out, but Michal has a very strong opinion here.

Yes, I still strongly believe this has to be opt-in.
--
Michal Hocko
SUSE Labs