lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170310102756.GE3753@dhcp22.suse.cz>
Date:   Fri, 10 Mar 2017 11:27:56 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Rik van Riel <riel@...hat.com>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mel Gorman <mgorman@...e.de>, Vlastimil Babka <vbabka@...e.cz>,
        Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        linux-mm@...ck.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm, vmscan: do not loop on too_many_isolated for ever

On Thu 09-03-17 17:18:00, Rik van Riel wrote:
> On Thu, 2017-03-09 at 13:05 -0500, Johannes Weiner wrote:
> > On Tue, Mar 07, 2017 at 02:52:36PM -0500, Rik van Riel wrote:
> > > 
> > > It only does this to some extent.  If reclaim made
> > > no progress, for example due to immediately bailing
> > > out because the number of already isolated pages is
> > > too high (due to many parallel reclaimers), the code
> > > could hit the "no_progress_loops > MAX_RECLAIM_RETRIES"
> > > test without ever looking at the number of reclaimable
> > > pages.
> > Hm, there is no early return there, actually. We bump the loop
> > counter
> > every time it happens, but then *do* look at the reclaimable pages.
> 
> Am I looking at an old tree?  I see this code
> before we look at the reclaimable pages.
> 
>         /*
>          * Make sure we converge to OOM if we cannot make any progress
>          * several times in the row.
>          */
>         if (*no_progress_loops > MAX_RECLAIM_RETRIES) {
>                 /* Before OOM, exhaust highatomic_reserve */
>                 return unreserve_highatomic_pageblock(ac, true);
>         }

I believe that Johannes meant cases where we do not exhaust all the
reclaim retries and fail early because there are no reclaimable pages
during the watermark check.

> > > Could that create problems if we have many concurrent
> > > reclaimers?
> > With increased concurrency, the likelihood of OOM will go up if we
> > remove the unlimited wait for isolated pages, that much is true.
> > 
> > I'm not sure that's a bad thing, however, because we want the OOM
> > killer to be predictable and timely. So a reasonable wait time in
> > between 0 and forever before an allocating thread gives up under
> > extreme concurrency makes sense to me.
> 
> That is a fair point, a faster OOM kill is preferable
> to a system that is livelocked.
> 
> > Unless I'm mistaken, there doesn't seem to be a whole lot of urgency
> > behind this patch. Can we think about a general model to deal with
> > allocation concurrency? Unlimited parallel direct reclaim is kinda
> > bonkers in the first place. How about checking for excessive
> > isolation
> > counts from the page allocator and putting allocations on a
> > waitqueue?
> 
> The (limited) number of reclaimers can still do a
> relatively fast OOM kill, if none of them manage
> to make progress.

well, we can estimate how much memory can those relatively few
reclaimers isolate and try to reclaim. Even if we have hundreds of them which
is more towards a large number to me then we are 100*SWAP_CLUSTER_MAX
which is not all that much. And we are effectivelly OOM if there is no
other reclaimable memory left. All we need is just to put some upper
bound. We already have throttle_direct_reclaim but it doesn't really
throttle the maximum number of reclaimers.
-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ