linux-kernel - Re: [RFC -v2] panic_on_oom

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20150617154159.GJ25056@dhcp22.suse.cz>
Date:	Wed, 17 Jun 2015 17:41:59 +0200
From:	Michal Hocko <mhocko@...e.cz>
To:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:	linux-mm@...ck.org, rientjes@...gle.com, hannes@...xchg.org,
	tj@...nel.org, akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC -v2] panic_on_oom_timeout

On Wed 17-06-15 22:59:54, Tetsuo Handa wrote:
> Michal Hocko wrote:
[...]
> > But you have a point that we could have
> > - constrained OOM which elevates oom_victims
> > - global OOM killer strikes but wouldn't start the timer
> > 
> > This is certainly possible and timer_pending(&panic_on_oom) replacing
> > oom_victims check should help here. I will think about this some more.
> 
> Yes, please.

Fixed in my local version. I will post the new version of the patch
after we settle with the approach.
 
> > The important thing is to decide what is the reasonable way forward. We
> > have two two implementations of panic based timeout. So we should decide
> > - Should we add a panic timeout at all?
> > - Should be the timeout bound to panic_on_oom?
> > - Should we care about constrained OOM contexts?
> > - If yes should they use the same timeout?
> > - If no should each memcg be able to define its own timeout?
> > 
> Exactly.
> 
> > My thinking is that it should be bound to panic_on_oom=1 only until we
> > hear from somebody actually asking for a constrained oom and even then
> > do not allow for too large configuration space (e.g. no per-memcg
> > timeout) or have separate mempolicy vs. memcg timeouts.
> 
> My implementation comes from providing debugging hints when analyzing
> vmcore of a stalled system. I'm posting logs of stalls after global OOM
> killer struck because it is easy to reproduce. But what I have problem
> is when a system stalled before the OOM killer strikes (I saw many cases
> for customer's enterprise servers), for we don't have hints for guessing
> whether memory allocator is the cause or not. Thus, my version tried to
> emit warning messages using sysctl_memalloc_task_warn_secs .

I can understand your frustration but I believe that a debugability is
a separate topic and we should start by defining a reasonable _policy_
so that an administrator has a way to handle potential OOM stalls
reasonably and with a well defined semantic.

> Ability to take care of constrained OOM contexts is a side effect of use of
> per a "struct task_struct" variable. Even if we come to a conclusion that
> we should not add a timeout for panic, I hope that a timeout for warning
> about memory allocation stalls is added.
> 
> > Let's start simple and make things more complicated later!
> 
> I think we mismatch about what the timeout counters are for.

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/