Message-ID: <alpine.DEB.2.00.0905071508190.2164@chino.kir.corp.google.com>
Date:	Thu, 7 May 2009 15:16:17 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
cc:	rjw@...k.pl, fengguang.wu@...el.com,
	linux-pm@...ts.linux-foundation.org, pavel@....cz,
	torvalds@...ux-foundation.org, jens.axboe@...cle.com,
	alan-jenkins@...fmail.co.uk, linux-kernel@...r.kernel.org,
	kernel-testers@...r.kernel.org
Subject: Re: [PATCH 1/5] mm: Add __GFP_NO_OOM_KILL flag

On Thu, 7 May 2009, Andrew Morton wrote:

> - the standard way of controlling memory allocator behaviour is via
>   the gfp_t.  Bypassing that is an unusual step and needs a higher
>   level of justification, which I'm not seeing here.
> 

The standard way of controlling the oom killer behavior for a zone is via 
the ZONE_OOM_LOCKED bit.

> - if we do this via an unusual global, we reduce the chances that
>   another subsystem could use the new feature.
> 
>   I don't know what subsystem that might be, but I bet they're out
>   there.  checkpoint-restart, virtual machines, ballooning memory
>   drivers, kexec loading, etc.
> 

There are two separate issues here: the use of ZONE_OOM_LOCKED to control 
whether or not to invoke the oom killer for a specific zone (which is 
already its only function), and the fact that in this case we're doing it 
for all zones.  It seems like you're concerned with the latter, but the 
distinction in the hibernation case is that the oom killer could not free 
memory in _any_ zone, so it makes sense to lock them all out.

> > The fact is that _all_ allocations here are implicitly __GFP_NO_OOM_KILL 
> > whether they specify it or not, since the oom killer would simply kill a 
> > task in D state which can't exit or free memory, and subsequent allocations 
> > would make the oom killer a no-op because there's an eligible task with 
> > TIF_MEMDIE set.  The only thing you're saving with __GFP_NO_OOM_KILL is 
> > calling the oom killer in the first place and killing an unresponsive task, 
> > but that would have to happen anyway when thawed since the system is oom 
> > (or otherwise lock up for GFP_KERNEL with order < PAGE_ALLOC_COSTLY_ORDER).
> 
> All the above is specific to the PM application only, when userspace
> tasks are stopped.
> 

I'm not arguing that the only way we can ever implement __GFP_NO_OOM_KILL 
is for the entire system: we can set ZONE_OOM_LOCKED for only the zones in 
the zonelist that is passed to the page allocator.  For this particular 
purpose, that is naturally all zones; future use cases may choose to lock 
out only the zones we're allowed to allocate from in that context.
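
To make the per-zonelist idea concrete, here is a rough userspace model of 
the locking described above.  It is only a sketch: the model_* names and 
structures are hypothetical stand-ins, not the kernel's definitions, and 
the boolean field merely imitates the ZONE_OOM_LOCKED bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a zone; oom_locked imitates ZONE_OOM_LOCKED. */
struct model_zone {
	const char *name;
	bool oom_locked;
};

/* Hypothetical stand-in for the zonelist handed to the page allocator. */
struct model_zonelist {
	struct model_zone **zones;
	size_t nr_zones;
};

/*
 * Lock out the oom killer for every zone in the list.  Succeeds only if
 * no zone in the list is already locked; on failure nothing is changed.
 */
bool model_try_lock_zonelist(struct model_zonelist *zl)
{
	size_t i;

	assert(zl != NULL);
	for (i = 0; i < zl->nr_zones; i++)
		if (zl->zones[i]->oom_locked)
			return false;
	for (i = 0; i < zl->nr_zones; i++)
		zl->zones[i]->oom_locked = true;
	return true;
}

/* Release the lockout for every zone in the list. */
void model_unlock_zonelist(struct model_zonelist *zl)
{
	size_t i;

	assert(zl != NULL);
	for (i = 0; i < zl->nr_zones; i++)
		zl->zones[i]->oom_locked = false;
}

/*
 * An allocation may invoke the oom killer only if it can take the lock
 * on all of its zones.  Returns whether the oom killer would have run.
 */
bool model_may_invoke_oom_killer(struct model_zonelist *zl)
{
	if (!model_try_lock_zonelist(zl))
		return false;
	/* A real allocator would select and kill a task here. */
	model_unlock_zonelist(zl);
	return true;
}
```

In this model, hibernation would hold the lock on a zonelist covering all 
zones, so every allocation sees model_may_invoke_oom_killer() return false; 
a narrower user would hold it only on the zones it cares about and leave 
allocations from disjoint zonelists unaffected.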

> It might well end up that stopping userspace (beforehand or before
> oom-killing) is a hard requirement for reliably disabling the
> oom-killer.

Yes, globally, but future use cases may disable it only for specific 
zones, such as with memory hot-remove.