[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20171003144308.zc5xaxmgbmiz2jvg@dhcp22.suse.cz>
Date: Tue, 3 Oct 2017 16:43:08 +0200
From: Michal Hocko <mhocko@...nel.org>
To: Roman Gushchin <guro@...com>
Cc: linux-mm@...ck.org, Vladimir Davydov <vdavydov.dev@...il.com>,
Johannes Weiner <hannes@...xchg.org>,
Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
David Rientjes <rientjes@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Tejun Heo <tj@...nel.org>, kernel-team@...com,
cgroups@...r.kernel.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [v9 3/5] mm, oom: cgroup-aware OOM killer
On Tue 03-10-17 15:38:08, Roman Gushchin wrote:
> On Tue, Oct 03, 2017 at 04:22:46PM +0200, Michal Hocko wrote:
> > On Tue 03-10-17 15:08:41, Roman Gushchin wrote:
> > > On Tue, Oct 03, 2017 at 03:36:23PM +0200, Michal Hocko wrote:
> > [...]
> > > > I guess we want to inherit the value on the memcg creation but I agree
> > > > that enforcing parent setting is weird. I will think about it some more
> > > > but I agree that it is saner to only enforce per memcg value.
> > >
> > > I'm not against, but we should come up with a good explanation, why we're
> > > inheriting it; or not inherit.
> >
> > Inheriting sounds like a less surprising behavior. Once you opt in for
> > oom_group you can expect that descendants are going to assume the same
> > unless they explicitly state otherwise.
> >
> > [...]
> > > > > > > @@ -962,6 +968,48 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
> > > > > > > __oom_kill_process(victim);
> > > > > > > }
> > > > > > >
> > > > > > > +static int oom_kill_memcg_member(struct task_struct *task, void *unused)
> > > > > > > +{
> > > > > > > + if (!tsk_is_oom_victim(task)) {
> > > > > >
> > > > > > How can this happen?
> > > > >
> > > > > We do start with killing the largest process, and then iterate over all tasks
> > > > > in the cgroup. So, this check is required to avoid killing tasks which are
> > > > > already in the termination process.
> > > >
> > > > Do you mean we have tsk_is_oom_victim && MMF_OOM_SKIP == T?
> > >
> > > No, just tsk_is_oom_victim. We're are killing the biggest task, and then _all_
> > > tasks. This is a way to skip the biggest task, and do not kill it again.
> >
> > OK, I have missed that part. Why are we doing that actually? Why don't
> > we simply do
> > /* If oom_group flag is set, kill all belonging tasks */
> > if (mem_cgroup_oom_group(oc->chosen_memcg))
> > mem_cgroup_scan_tasks(oc->chosen_memcg, oom_kill_memcg_member,
> > NULL);
> >
> > we are going to kill all the tasks anyway.
>
> Well, the idea behind was that killing the biggest process give us better
> chances to get out of global memory shortage and guarantee forward progress.
> I can drop it, if it considered to be excessive.
Yes, please do so. If we need it then we can do that in a separate patch
along with the explanation why it is needed.
--
Michal Hocko
SUSE Labs
Powered by blists - more mailing lists