Date:   Mon, 27 Apr 2020 16:03:56 -0700 (PDT)
From:   David Rientjes <rientjes@...gle.com>
To:     Andrew Morton <akpm@...ux-foundation.org>
cc:     Vlastimil Babka <vbabka@...e.cz>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [patch] mm, oom: stop reclaiming if GFP_ATOMIC will start failing
 soon

On Mon, 27 Apr 2020, Andrew Morton wrote:

> > No - that would actually make the problem worse.
> > 
> > Today, per-zone min watermarks dictate when user allocations will loop or 
> > oom kill.  should_reclaim_retry() currently loops if reclaim has succeeded 
> > in the past few tries and we should be able to allocate if we are able to 
> > reclaim the amount of memory that we think we can.
> > 
> > The issue is that this supposes that looping to reclaim more will result 
> > in more free memory.  That doesn't always happen if there are concurrent 
> > memory allocators.
> > 
> > GFP_ATOMIC allocators can access below these per-zone watermarks.  So the 
> > issue is that per-zone free pages stays between ALLOC_HIGH watermarks 
> > (the watermark that GFP_ATOMIC allocators can allocate to) and min 
> > watermarks.  We never reclaim enough memory to get back to min watermarks 
> > because reclaim cannot keep up with the amount of GFP_ATOMIC allocations.
> 
> But there should be an upper bound upon the total amount of in-flight
> GFP_ATOMIC memory at any point in time?  These aren't like pagecache
> which will take more if we give it more.  Setting the various
> thresholds appropriately should ensure that blockable allocations don't
> get their memory stolen by GFP_ATOMIC allocations?
> 

Certainly if that upper bound were defined and enforced somewhere, we would 
not have run into this issue, which caused all of userspace to become 
completely unresponsive.  Do you have links to patches that proposed 
enforcing such an upper bound?  It seems like it would have to be generic 
to __alloc_pages_slowpath(), because otherwise multiple GFP_ATOMIC 
allocators, all from different sources, have no way to coordinate their 
allocations amongst themselves and ensure they don't conspire to cause 
this depletion.  I'd be happy to take a look if there are links to other 
approaches.
