Date:	Tue, 21 Jan 2014 12:41:41 -0800 (PST)
From:	David Rientjes <rientjes@...gle.com>
To:	Jianguo Wu <wujianguo@...wei.com>
cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Johannes Weiner <hannes@...xchg.org>,
	Rik van Riel <riel@...hat.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [question] how to figure out OOM reason? should dump slab/vmalloc
 info when OOM?

On Tue, 21 Jan 2014, Jianguo Wu wrote:

> > The problem is that slabinfo becomes excessively verbose and dumping it 
> > all to the kernel log often times causes important messages to be lost.  
> > This is why we control things like the tasklist dump with a VM sysctl.  It 
> > would be possible to dump, say, the top ten slab caches with the highest 
> > memory usage, but it will only be helpful for slab leaks.  Typically there 
> > are better debugging tools available than analyzing the kernel log; if you 
> > see unusually high slab memory in the meminfo dump, you can enable it.
> > 
> 
> But, when OOM has happened, we can only use kernel log, slab/vmalloc info from proc
> is stale. Maybe we can dump slab/vmalloc with a VM sysctl, and only top 10/20 entries?
> 

You could, but it's a tradeoff between how much to dump to a general 
resource such as the kernel log and how many sysctls we add that control 
every possible thing.  Slab leaks would definitely be a minority of oom 
conditions and you should normally be able to reproduce them by running 
the same workload; just use slabtop(1) or manually inspect /proc/slabinfo 
while such a workload is running for indicators.  I don't think we want to 
add the information by default, though, nor do we want to add sysctls to 
control the behavior (you'd still need to reproduce the issue after 
enabling it).
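
As a rough illustration of the manual inspection described above, the
following sketch ranks slab caches by approximate size while a workload is
running. It assumes the version 2.1 /proc/slabinfo layout (name,
active_objs, num_objs, objsize, ...) and estimates usage as
num_objs * objsize, which ignores per-slab overhead. The function name
top_slab_caches is hypothetical, and reading /proc/slabinfo usually
requires root.

```python
def top_slab_caches(slabinfo_text, n=10):
    """Return the n largest caches as (name, approx_bytes), largest first."""
    caches = []
    for line in slabinfo_text.splitlines():
        # Skip the version banner and the column-header comment line.
        if line.startswith("slabinfo") or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            # Assumed layout: fields[2] is num_objs, fields[3] is objsize.
            num_objs, objsize = int(fields[2]), int(fields[3])
        except ValueError:
            continue
        # Approximation: object count times object size, no slab overhead.
        caches.append((fields[0], num_objs * objsize))
    caches.sort(key=lambda c: c[1], reverse=True)
    return caches[:n]


if __name__ == "__main__":
    try:
        with open("/proc/slabinfo") as f:
            text = f.read()
    except OSError as e:
        print(f"cannot read /proc/slabinfo (usually requires root): {e}")
    else:
        for name, size in top_slab_caches(text):
            print(f"{name:<28} {size / 1024:>12.0f} KiB")
```

Running this periodically during the workload and comparing the output
over time should make a steadily growing cache stand out, much as
slabtop(1) would.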

We are currently discussing userspace oom handlers, though, that would 
allow you to run a process that would be notified and allowed to allocate 
a small amount of memory on oom conditions.  It would then be trivial to 
dump any information you feel pertinent in userspace prior to killing 
something.  I like to inspect heap profiles for memory hogs while 
debugging our malloc() issues, for example, and you could look more 
closely at kernel memory.
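
A minimal sketch of the kind of kernel memory information such a handler
might collect is shown below. It only parses a few /proc/meminfo fields
(Slab, SReclaimable, SUnreclaim, VmallocUsed). The notification mechanism
itself is not shown because it is still under discussion, and the function
name parse_meminfo is hypothetical.

```python
def parse_meminfo(meminfo_text,
                  keys=("Slab", "SReclaimable", "SUnreclaim", "VmallocUsed")):
    """Return {field: kB} for the requested /proc/meminfo fields present."""
    wanted = set(keys)
    values = {}
    for line in meminfo_text.splitlines():
        # Lines look like "Slab:            2048 kB".
        name, sep, rest = line.partition(":")
        if not sep or name not in wanted:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name] = int(parts[0])
    return values


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            snapshot = parse_meminfo(f.read())
    except OSError as e:
        print(f"cannot read /proc/meminfo: {e}")
    else:
        for name, kb in snapshot.items():
            print(f"{name}: {kb} kB")
```

Keeping the parser free of large allocations matters here, since a handler
of this kind would only be allowed a small amount of memory.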

I'll cc you on future discussions of that feature.
