linux-kernel - Re: [PATCH 0/10 -v3] Handle oom bypass more gracefully

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20160608160509.GA21838@dhcp22.suse.cz>
Date:	Wed, 8 Jun 2016 18:05:09 +0200
From:	Michal Hocko <mhocko@...nel.org>
To:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:	linux-mm@...ck.org, rientjes@...gle.com, oleg@...hat.com,
	vdavydov@...allels.com, akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/10 -v3] Handle oom bypass more gracefully

On Wed 08-06-16 23:55:24, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Wed 08-06-16 06:49:24, Tetsuo Handa wrote:
> > > Michal Hocko wrote:
> > > > OK, so you are arming the timer for each mark_oom_victim regardless
> > > > of the oom context. This means that you have replaced one potential
> > > > lockup by other potential livelocks. Tasks from different oom domains
> > > > might interfere here...
> > > > 
> > > > Also this code doesn't even seem easier. It is surely less lines of
> > > > code but it is really hard to realize how would the timer behave for
> > > > different oom contexts.
> > > 
> > > If you worry about interference, we can use per signal_struct timestamp.
> > > I used per task_struct timestamp in my earlier versions (where per
> > > task_struct TIF_MEMDIE check was used instead of per signal_struct
> > > oom_victims).
> > 
> > This would allow pre-mature new victim selection for very large victims
> > (note that exit_mmap can take a while depending on the mm size). It also
> > pushed the timeout heuristic for everybody which will sooner or later
> > open a question why is this $NUMBER rathen than $NUMBER+$FOO.
> 
> You are again worrying about wrong problem. You are ignoring distinction
> between genuine lock up (real problem for you) and effectively locked up
> (real problem for administrators).

No, I just do care more about a sane and consistent behavior rather than
a random one which is inherent to timeout based solutions.

[...]
> > To be honest I would rather explore ways to handle kthread case (which
> > is the only real one IMHO from the two) gracefully and made them a
> > nonissue - e.g. enforce EFAULT on a dead mm during the kthread page fault
> > or something similar.
> 
> You are always living in a world with plenty resource. You tend to ignore
> CONFIG_MMU=n kernels.

You keep repeating !CONFIG_MMU case but never shown a single evidence
this is an actual problem for those platforms. If this turns out to
be a real problem I am willing to spend time on it and try to find a
solution.

I am sorry, I have left most of your email without any reaction. I
just don't believe repeating the same arguments is anyhow useful or
productive. Quite some time ago I've told that I strongly believe that
doing any timeout based heuristic should be only the last resort when we
fail all other ways to mitigate the issue. I think I've shown that there
is a path, which btw. doesn't add too much code into the oom path. All
the patches are self rather small incrementally improve the situation. I
have already told you that you, as a reviewer, are free to nack those
patches if you believe they are incorrect, unmaintainable or actively
harmful to the common case. Unless you do so with a proper justification
I will not react to any timeout based solutions.

I will post the full series with all the remaining fixups tomorrow and
let all the reviewers to judge. I strongly believe that it puts some
order to the code and makes it easier to reason about. If this view
is not shared by others I can back off and not pursue this path any
longer.
-- 
Michal Hocko
SUSE Labs