linux-kernel - Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

Message-ID: <20170123110412.z5e372pabnnezdnu@techsingularity.net>
Date:   Mon, 23 Jan 2017 11:04:12 +0000
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Trevor Cordes <trevor@...nopolis.ca>
Cc:     Michal Hocko <mhocko@...nel.org>, linux-kernel@...r.kernel.org,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Minchan Kim <minchan@...nel.org>,
        Rik van Riel <riel@...riel.com>,
        Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
Subject: Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

On Mon, Jan 23, 2017 at 10:48:58AM +0000, Mel Gorman wrote:
> On Sun, Jan 22, 2017 at 06:45:59PM -0600, Trevor Cordes wrote:
> > On 2017-01-20 Mel Gorman wrote:
> > > > 
> > > > Thanks for the OOM report. I was expecting it to be a particular
> > > > shape and my expectations were not matched so it took time to
> > > > consider it further. Can you try the cumulative patch below? It
> > > > combines three patches that
> > > > 
> > > > 1. Allow slab shrinking even if the LRU pages are unreclaimable in
> > > >    direct reclaim
> > > > 2. Shrinks slab once based on the contents of all memcgs
> > > >    instead of shrinking one at a time
> > > > 3. Tries to shrink slabs if the lowmem usage is too high
> > > > 
> > > > Unfortunately it's only boot tested on x86-64 as I didn't get the
> > > > chance to set up an i386 test bed.
> > > >   
> > > 
> > > There was one major flaw in that patch. This version fixes it and
> > > addresses other minor issues. It may still be too aggressive shrinking
> > > slab but worth trying out. Thanks.
> > 
> > I ran with your patch below and it oom'd on the first night.  It was
> > weird, it didn't hang the system, and my rebooter script started a
> > reboot but the system never got more than half down before it just sat
> > there in a weird state where a local console user could still login but
> > not much was working.  So the patches don't seem to solve the problem.
> > 
> > For the above compile I applied your patches to 4.10.0-rc4+, I hope
> > that's ok.
> > 
> 
> It would be strongly preferred to run them on top of Michal's other
> fixes. The main reason it's preferred is because this OOM differs from
> earlier ones in that it OOM killed from GFP_NOFS|__GFP_NOFAIL context.
> That meant that the slab shrinking could not happen from direct reclaim so
> the balancing from my patches would not occur.  As Michal's other patches
> affect how kswapd behaves, it's important.
> 
> Unfortunately, even that will be race prone for GFP_NOFS callers as
> they'll effectively be racing to see if kswapd or another direct
> reclaimer can reclaim before the OOM conditions are hit. It is by
> design, but it's apparent that a __GFP_NOFAIL request can trigger OOM
> relatively easily as it's not necessarily throttled or waiting on kswapd
> to complete any work. I'll keep thinking about it.
> 

As a slight follow-up, albeit without patches, further options are to:

1. In should_reclaim_retry, account for SLAB_RECLAIMABLE as available
   pages when deciding to retry reclaim
2. Stall in should_reclaim_retry for GFP_NOFS|__GFP_NOFAIL with a
   comment stating that the intent is to allow kswapd to make progress
   with the shrinker
3. Stall GFP_NOFS direct reclaimers on a workqueue when they are
   failing to make progress, to allow kswapd to do some work. This
   may be impaired if kswapd is locked up waiting for a lock held
   by the direct reclaimer
4. Schedule the system workqueue to drain slab for
   GFP_NOFS|__GFP_NOFAIL.

Options 3 and 4 are extremely heavy-handed, so we should try them one at a time.
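
To illustrate option 1, here is a rough userspace model in C. It is not
kernel code and not a patch. struct zone_model, its fields and
should_retry_model() are hypothetical names invented for this sketch, and
the retry cap of 16 is an assumption for illustration only. The idea is
that the retry decision estimates how many pages could still become
available, discounts that estimate as loops without progress accumulate,
and compares the result with the min watermark. Counting reclaimable slab
in the estimate keeps a lowmem zone with little LRU but a lot of slab
retrying instead of declaring OOM.

```c
#include <stdbool.h>

#define MAX_RECLAIM_RETRIES 16	/* assumed retry cap, for illustration only */

/* Hypothetical per-zone counters in pages; not the kernel's struct zone. */
struct zone_model {
	unsigned long free_pages;
	unsigned long lru_reclaimable;
	unsigned long slab_reclaimable;
	unsigned long min_watermark;
};

/*
 * Decide whether another reclaim attempt is worthwhile.
 * count_slab selects whether reclaimable slab is treated as available.
 */
static bool should_retry_model(const struct zone_model *z,
			       unsigned int no_progress_loops,
			       bool count_slab)
{
	unsigned long available = z->lru_reclaimable;

	/* Give up once the cap on loops without progress is exceeded. */
	if (no_progress_loops > MAX_RECLAIM_RETRIES)
		return false;

	/* Option 1: treat reclaimable slab as potentially available. */
	if (count_slab)
		available += z->slab_reclaimable;

	/* Scale the estimate down as loops without progress pile up. */
	available -= (no_progress_loops * available + MAX_RECLAIM_RETRIES - 1)
		     / MAX_RECLAIM_RETRIES;
	available += z->free_pages;

	return available > z->min_watermark;
}
```

For a zone with 100 free pages, no reclaimable LRU, 5000 pages of
reclaimable slab and a min watermark of 1000, the model refuses to retry
after 4 loops without progress when slab is ignored, but retries when
slab is counted. Once the loop count reaches the cap the discount cancels
the whole estimate and both variants give up, so the retry stays bounded.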

-- 
Mel Gorman
SUSE Labs