Message-ID: <alpine.DEB.2.22.394.2004281436280.131129@chino.kir.corp.google.com>
Date:   Tue, 28 Apr 2020 14:48:25 -0700 (PDT)
From:   David Rientjes <rientjes@...gle.com>
To:     Vlastimil Babka <vbabka@...e.cz>,
        Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
cc:     Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        Mel Gorman <mgorman@...hsingularity.net>
Subject: Re: [patch] mm, oom: stop reclaiming if GFP_ATOMIC will start failing soon

On Tue, 28 Apr 2020, Vlastimil Babka wrote:

> > I took a look at doing a quick-fix for the
> > direct-reclaimers-get-their-stuff-stolen issue about a million years
> > ago.  I don't recall where it ended up.  It's pretty trivial for the
> > direct reclaimer to free pages into current->reclaimed_pages and to
> > take a look in there on the allocation path, etc.  But it's only
> > practical for order-0 pages.
> 
> FWIW there's already such approach added to compaction by Mel some time ago,
> so order>0 allocations are covered to some extent. But in this case I imagine
> that compaction won't even start because order-0 watermarks are too low.
> 
> The order-0 reclaim capture might work though - as a result the GFP_ATOMIC
> allocations would more likely fail and defer to their fallback context.
> 

Yes, order-0 reclaim capture is interesting, since the issue being reported 
here is userspace going out to lunch: it loops for an unbounded amount of 
time trying to get above the watermark at which it is allowed to allocate, 
while other consumers keep depleting that resource.

We actually prefer to oom kill earlier rather than be put in a perpetual 
state of aggressive reclaim that affects all allocators, since the 
unbounded nature of those allocation attempts leads to very poor results 
for everybody.

I'm happy to scope this solely to order-0 reclaim capture.  I'm just not 
clear on whether this has been worked on before and whether patches 
existed in the past.

Somewhat related to what I described in the changelog: we lost the "page 
allocation stalls" artifacts in the kernel log in 4.15.  The commit 
description references an asynchronous mechanism for getting this 
information; I don't know where that mechanism currently lives.
