Message-ID: <alpine.DEB.2.10.1510261424290.12408@chino.kir.corp.google.com>
Date:	Mon, 26 Oct 2015 14:38:11 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Aristeu Rozanski <arozansk@...hat.com>
cc:	linux-kernel@...r.kernel.org, Greg Thelen <gthelen@...gle.com>,
	Johannes Weiner <hannes@...xchg.org>, linux-mm@...ck.org,
	cgroups@...r.kernel.org
Subject: Re: [PATCH] oom_kill: add option to disable dump_stack()

On Fri, 23 Oct 2015, Aristeu Rozanski wrote:

> One of the largest chunks of log messages in an OOM is from dump_stack() and in
> some cases it isn't even necessary to figure out what's going on. In
> systems with multiple tenants/containers with limited resources each,
> OOMs can be way more frequent and being able to reduce the amount of log
> output for each situation is useful.
> 
> This patch adds a sysctl to allow disabling dump_stack() during an OOM while
> keeping the default to behave the same way it behaves today.
> 
> Cc: Greg Thelen <gthelen@...gle.com>
> Cc: Johannes Weiner <hannes@...xchg.org>
> Cc: linux-mm@...ck.org
> Cc: cgroups@...r.kernel.org
> Signed-off-by: Aristeu Rozanski <arozansk@...hat.com>

There's lots of information in the oom log that is irrelevant depending on 
the context in which the oom condition occurred.  Removing the stack trace 
would have made things like commit 9a185e5861e8 ("/proc/stat: convert to 
single_open_size()") harder to fix.  In that case, we were calling the oom 
killer on large file reads from procfs when we could easily have 
used vmalloc() instead.

When you have a memcg oom kill, the dump of the system's memory state can 
usually be suppressed: the kill occurred only because a memcg hierarchy 
reached its limit, and it has nothing to do with the exhaustion of RAM.

We already control oom output with global sysctls like vm.oom_dump_tasks 
and memcg tunables like memory.oom_verbose.  I'm not sure that adding more 
and more tunables simply to control the oom killer output is in the best 
interest of either procfs or a long-term maintainable kernel.
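
For reference, the existing global knob can be inspected from userspace 
without any new tunable.  The following is a minimal sketch, not part of 
the patch under discussion: read_sysctl() is a hypothetical helper that 
assumes procfs is mounted at /proc and that the sysctl name has no dots 
inside a path component (true for vm.*).

```python
from pathlib import Path
from typing import Optional


def read_sysctl(name: str, root: str = "/proc/sys") -> Optional[str]:
    """Return the value of a sysctl such as 'vm.oom_dump_tasks'.

    Hypothetical helper: maps the dotted name onto a path below `root`
    and returns None if the file is missing or unreadable.
    """
    path = Path(root).joinpath(*name.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    # vm.oom_dump_tasks controls whether the per-task list is dumped on oom.
    print("vm.oom_dump_tasks =", read_sysctl("vm.oom_dump_tasks"))
```

The `root` parameter exists only so the helper can be pointed at a 
different directory for testing.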

I can understand the usefulness of having a very small amount of output to 
the kernel log and then enabling tunables to investigate why oom kills are 
happening, but in many situations I've found that only the oom killer 
output is left behind in the kernel log and the situation is no longer 
ongoing, so I can't start diagnosing the problem if I don't know what 
triggered it.

I think adding additional sysctls to control oom killer output is a step 
in the wrong direction.  I do agree with removing anything that is 
irrelevant in all situations, however.