Message-Id: <201703102044.DBJ04626.FLVMFOQOJtOFHS@I-love.SAKURA.ne.jp>
Date:   Fri, 10 Mar 2017 20:44:38 +0900
From:   Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To:     mhocko@...nel.org, hannes@...xchg.org
Cc:     riel@...hat.com, akpm@...ux-foundation.org, mgorman@...e.de,
        vbabka@...e.cz, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm, vmscan: do not loop on too_many_isolated for ever

Michal Hocko wrote:
> On Thu 09-03-17 13:05:40, Johannes Weiner wrote:
> > On Tue, Mar 07, 2017 at 02:52:36PM -0500, Rik van Riel wrote:
> > > It only does this to some extent.  If reclaim made
> > > no progress, for example due to immediately bailing
> > > out because the number of already isolated pages is
> > > too high (due to many parallel reclaimers), the code
> > > could hit the "no_progress_loops > MAX_RECLAIM_RETRIES"
> > > test without ever looking at the number of reclaimable
> > > pages.
> > 
> > Hm, there is no early return there, actually. We bump the loop counter
> > every time it happens, but then *do* look at the reclaimable pages.
> > 
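(For reference, the two pieces being discussed are the too_many_isolated() wait in
shrink_inactive_list() and the retry accounting in the page allocator; the following
is a simplified sketch from memory, not the verbatim mm/vmscan.c / mm/page_alloc.c code:

	/* mm/vmscan.c, shrink_inactive_list(): direct reclaimers spin here
	 * while too many pages are isolated by parallel reclaimers */
	while (unlikely(too_many_isolated(pgdat, file, sc))) {
		congestion_wait(BLK_RW_ASYNC, HZ/10);

		/* We are about to die and free our memory. Return now. */
		if (fatal_signal_pending(current))
			return SWAP_CLUSTER_MAX;
	}

	/* mm/page_alloc.c, should_reclaim_retry(): the counter is bumped
	 * whenever reclaim made no progress (including the bail-out above),
	 * but the reclaimable-pages check still runs on every retry until
	 * the counter exceeds MAX_RECLAIM_RETRIES */
	if (did_some_progress)
		*no_progress_loops = 0;
	else
		(*no_progress_loops)++;

	if (*no_progress_loops > MAX_RECLAIM_RETRIES)
		return false;	/* stop retrying; the OOM killer is next */

	/* ... otherwise walk the zonelist and check whether enough
	 * reclaimable pages remain to justify another retry ... */
)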
> > > Could that create problems if we have many concurrent
> > > reclaimers?
> > 
> > With increased concurrency, the likelihood of OOM will go up if we
> > remove the unlimited wait for isolated pages, that much is true.
> > 
> > I'm not sure that's a bad thing, however, because we want the OOM
> > killer to be predictable and timely. So a reasonable wait time in
> > between 0 and forever before an allocating thread gives up under
> > extreme concurrency makes sense to me.
> > 
> > > It may be OK, I just do not understand all the implications.
> > > 
> > > I like the general direction your patch takes the code in,
> > > but I would like to understand it better...
> > 
> > I feel the same way. The throttling logic doesn't seem to be very well
> > thought out at the moment, making it hard to reason about what happens
> > in certain scenarios.
> > 
> > In that sense, this patch isn't really an overall improvement to the
> > way things work. It patches a hole that seems to be exploitable only
> > from an artificial OOM torture test, at the risk of regressing high
> > concurrency workloads that may or may not be artificial.
> > 
> > Unless I'm mistaken, there doesn't seem to be a whole lot of urgency
> > behind this patch. Can we think about a general model to deal with
> > allocation concurrency? 
> 
> I am definitely not against. There is no reason to rush the patch in.

I wouldn't be in a hurry if we could use a watchdog to check whether this problem
is occurring in the real world. Since such a watchdog is missing, I have to test
corner cases instead.

> My main point behind this patch was to reduce unbound loops from inside
> the reclaim path and push any throttling up the call chain to the
> page allocator path because I believe that it is easier to reason
> about them at that level. The direct reclaim should be as simple as
> possible without too many side effects otherwise we end up in a highly
> unpredictable behavior. This was a first step in that direction and my
> testing so far didn't show any regressions.
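(Illustrative only, and explicitly not the patch itself: as I read that direction,
the reclaim-side change would be along the lines of replacing the unbound wait with
a single bounded one and leaving any further throttling to the allocator:

	/* sketch of the direction, not the actual patch: wait once with a
	 * timeout instead of looping for ever, then bail out and let the
	 * page allocator's retry/throttle logic decide what to do next */
	if (unlikely(too_many_isolated(pgdat, file, sc))) {
		if (fatal_signal_pending(current))
			return SWAP_CLUSTER_MAX;

		congestion_wait(BLK_RW_ASYNC, HZ/10);

		if (too_many_isolated(pgdat, file, sc))
			return 0;	/* no progress; caller retries or gives up */
	}
)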
> 
> > Unlimited parallel direct reclaim is kinda
> > bonkers in the first place. How about checking for excessive isolation
> > counts from the page allocator and putting allocations on a waitqueue?
> 
> I would be interested in details here.

That would help with implementing __GFP_KILLABLE.
https://bugzilla.kernel.org/show_bug.cgi?id=192981#c15
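
(A very rough sketch of the waitqueue idea, purely illustrative: the waitqueue, the
isolation_too_high() helper and the wakeup point below are hypothetical, not existing
kernel code. A killable wait like this is also what would make __GFP_KILLABLE
straightforward to honour:

	/* hypothetical: one waitqueue, woken whenever isolated pages are
	 * put back, e.g. from putback_inactive_pages() */
	static DECLARE_WAIT_QUEUE_HEAD(isolation_wq);

	static int throttle_on_isolation(pg_data_t *pgdat)
	{
		/* Sleep in the page allocator instead of looping inside
		 * direct reclaim.  wait_event_killable() returns
		 * -ERESTARTSYS for a fatally signalled task, so a
		 * __GFP_KILLABLE allocation could simply fail here. */
		return wait_event_killable(isolation_wq,
					   !isolation_too_high(pgdat));
	}
)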
