linux-kernel - Re: [patch 08/11 -mmotm] oom: invoke oom killer for __GFP


Message-ID: <alpine.DEB.2.00.0905111159130.23739@chino.kir.corp.google.com>
Date:	Mon, 11 May 2009 12:09:57 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Dave Hansen <dave@...ux.vnet.ibm.com>
cc:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Greg Kroah-Hartman <gregkh@...e.de>,
	Nick Piggin <npiggin@...e.de>, Mel Gorman <mel@....ul.ie>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Christoph Lameter <cl@...ux-foundation.org>,
	San Mehat <san@...roid.com>, Arve Hjonnevag <arve@...roid.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [patch 08/11 -mmotm] oom: invoke oom killer for __GFP_NOFAIL

On Mon, 11 May 2009, Dave Hansen wrote:

> Could you explain a little more about why you think this scenario works
> for you?  Are large contiguous areas of memory pinned by the task
> which you want to get killed?  Why wasn't swapping effective
> against this task?  Was the task itself taking up a large portion of
> total memory?
> 

We frequently do cpuset-constrained oom kills where the lion's share of
memory on a set of nodes is allocated by a single task or a group of
threads all sharing the same memory.  Swapping is largely effective, but at
this point in the code path it is clearly not making any progress in
freeing pages.  So this change fixes two issues:

 - __GFP_NOFAIL allocations should not be allowed to return NULL, and

 - we should prevent looping endlessly in the page allocator if reclaim
   cannot free the requisite amount of memory.
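
The two points above can be illustrated with a small userspace model.
This is only a sketch under stated assumptions, not the patch itself
and not kernel code: struct toy_zone, toy_reclaim(), toy_oom_kill() and
toy_alloc() are hypothetical names, memory contiguity is not modeled,
and the model gives up once nothing is left to kill so that it always
terminates.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a memory zone. Every name here is hypothetical. */
struct toy_zone {
    unsigned long free_pages;    /* pages currently free */
    unsigned long reclaimable;   /* pages that reclaim can still free */
    unsigned long victim_pages;  /* pages held by the largest task */
    bool victim_alive;           /* is that task still running? */
    unsigned int oom_kills;      /* how often the toy OOM killer ran */
};

/* Reclaim frees whatever it can and reports how many pages it freed. */
static unsigned long toy_reclaim(struct toy_zone *z)
{
    unsigned long freed = z->reclaimable;

    z->free_pages += freed;
    z->reclaimable = 0;
    return freed;
}

/* Stand-in for the OOM killer: kill the memory hog, release its pages. */
static bool toy_oom_kill(struct toy_zone *z)
{
    if (!z->victim_alive)
        return false;
    z->victim_alive = false;
    z->free_pages += z->victim_pages;
    z->victim_pages = 0;
    z->oom_kills++;
    return true;
}

/*
 * Try to allocate 2^order pages. If reclaim makes no progress, an
 * ordinary request fails, while a nofail request invokes the toy OOM
 * killer instead of spinning forever. The final "return false" exists
 * only so this model terminates when no victim is left.
 */
static bool toy_alloc(struct toy_zone *z, unsigned int order, bool nofail)
{
    unsigned long need = 1UL << order;

    for (;;) {
        if (z->free_pages >= need) {
            z->free_pages -= need;
            return true;
        }
        if (toy_reclaim(z) > 0)
            continue;        /* reclaim made progress: retry */
        if (!nofail)
            return false;    /* ordinary allocations may fail */
        if (!toy_oom_kill(z))
            return false;    /* nothing left to kill in this model */
    }
}
```

With 2 free pages, nothing reclaimable, and one task holding 64 pages,
an order-4 nofail request in this model succeeds after exactly one
kill, while the same request without nofail fails and kills nothing.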

There is no reason that the oom killer would not be able to kill a task 
that could free 64K of contiguous memory, especially for those that 
mlock() their memory.  You could argue that any __GFP_NOFAIL allocation 
above order 3 is insane and should not kill tasks, but that's an issue 
higher up the stack.  If you'd like to identify such instances, we could 
emit a warning message here and a stack trace.
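
For reference, the relation between the 64K figure and the order 3
threshold can be worked out as below. This is a sketch assuming 4K
pages; toy_get_order() and toy_should_warn() are hypothetical helpers
written for illustration, not functions from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE  4096UL  /* assumption: 4K pages */
#define TOY_WARN_ORDER 3       /* threshold named in the text above */

/* Smallest order such that 2^order pages cover the given byte count. */
static unsigned int toy_get_order(size_t bytes)
{
    unsigned long pages = (bytes + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
    unsigned int order = 0;

    while ((1UL << order) < pages)
        order++;
    return order;
}

/* Would a request of this order deserve the suggested warning? */
static bool toy_should_warn(unsigned int order, bool nofail)
{
    return nofail && order > TOY_WARN_ORDER;
}
```

Under the 4K page assumption, 64K is 16 pages, i.e. order 4, so a
nofail request of that size is just above the order 3 threshold, while
a 32K request (order 3) is not.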