linux-kernel - Re: [patch 3/3 -mmotm] oom: invoke oom killer for __GFP

Message-ID: <alpine.DEB.2.00.0906020016060.24915@chino.kir.corp.google.com>
Date:	Tue, 2 Jun 2009 00:26:50 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
cc:	Nick Piggin <npiggin@...e.de>, Rik van Riel <riel@...hat.com>,
	Mel Gorman <mel@....ul.ie>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Dave Hansen <dave@...ux.vnet.ibm.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [patch 3/3 -mmotm] oom: invoke oom killer for __GFP_NOFAIL

On Mon, 1 Jun 2009, Andrew Morton wrote:

> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1547,7 +1547,7 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
> >  		goto out;
> >  
> >  	/* The OOM killer will not help higher order allocs */
> > -	if (order > PAGE_ALLOC_COSTLY_ORDER)
> > +	if (order > PAGE_ALLOC_COSTLY_ORDER && !(gfp_mask & __GFP_NOFAIL))
> >  		goto out;
> >  
> >  	/* Exhausted what can be done so it's blamo time */
> > @@ -1765,11 +1765,13 @@ rebalance:
> >  				goto got_pg;
> >  
> >  			/*
> > -			 * The OOM killer does not trigger for high-order allocations
> > -			 * but if no progress is being made, there are no other
> > -			 * options and retrying is unlikely to help
> > +			 * The OOM killer does not trigger for high-order
> > +			 * ~__GFP_NOFAIL allocations so if no progress is being
> > +			 * made, there are no other options and retrying is
> > +			 * unlikely to help.
> >  			 */
> > -			if (order > PAGE_ALLOC_COSTLY_ORDER)
> > +			if (order > PAGE_ALLOC_COSTLY_ORDER &&
> > +						!(gfp_mask & __GFP_NOFAIL))
> >  				goto nopage;
> >  
> >  			goto restart;
> 
> I really think/hope/expect that this is unneeded.
> 
> Do we know of any callsites which do greater-than-order-0 allocations
> with GFP_NOFAIL?  If so, we should fix them.
> 
> Then just ban order>0 && GFP_NOFAIL allocations.
> 

That seems like a different topic: banning higher-order __GFP_NOFAIL
allocations, or deprecating __GFP_NOFAIL altogether and slowly switching
users over, is a worthwhile effort, but it is unrelated to this patch.

This patch is necessary because we explicitly prevent the oom killer from
being invoked when the order is greater than PAGE_ALLOC_COSTLY_ORDER, on
the assumption that it won't help.  That assumption isn't always true, for
example when a large memory-hogging task has mlocked large chunks of
contiguous memory.  The only thing we do know is that direct reclaim has
not made any progress, so we're unlikely to see a substantial amount of
memory freed in the immediate future.  In that situation a __GFP_NOFAIL
allocation will simply loop forever, because the rogue task is never
killed.
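
To make the change in behavior concrete, here is a rough userspace sketch
of the order check before and after the patch.  It models only the
condition touched by the diff, not the other early exits in
__alloc_pages_may_oom().  The names oom_allowed_before(),
oom_allowed_after(), sketch_gfp_t and SKETCH_GFP_NOFAIL are hypothetical,
and both constant values are stand-ins for the kernel's definitions rather
than authoritative copies of them.

```c
#include <stdbool.h>

/* Stand-in values, assumed for illustration only. */
#define PAGE_ALLOC_COSTLY_ORDER 3
#define SKETCH_GFP_NOFAIL 0x800u

typedef unsigned int sketch_gfp_t;

/*
 * Before the patch: any allocation above the costly order is denied
 * the oom killer, whatever its flags say.
 */
bool oom_allowed_before(sketch_gfp_t gfp_mask, unsigned int order)
{
	(void)gfp_mask;	/* the flags were not consulted for this check */
	return order <= PAGE_ALLOC_COSTLY_ORDER;
}

/*
 * After the patch: a costly-order allocation is still denied the oom
 * killer unless it carries the no-fail flag.
 */
bool oom_allowed_after(sketch_gfp_t gfp_mask, unsigned int order)
{
	if (order > PAGE_ALLOC_COSTLY_ORDER && !(gfp_mask & SKETCH_GFP_NOFAIL))
		return false;
	return true;
}
```

With these stand-ins, an order-4 request carrying SKETCH_GFP_NOFAIL is
refused by oom_allowed_before() and accepted by oom_allowed_after(), while
an order-4 request without the flag is refused by both.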

So while it's better in the long term to deprecate the flag as much as
possible and perhaps someday remove it from the page allocator entirely,
the choice we face today is between looping endlessly and freeing memory
so that the kernel allocation may succeed once direct reclaim has failed.
That also makes this a rare instance where the oom killer will never
needlessly kill a task.