Message-ID: <20170123110412.z5e372pabnnezdnu@techsingularity.net>
Date: Mon, 23 Jan 2017 11:04:12 +0000
From: Mel Gorman <mgorman@...hsingularity.net>
To: Trevor Cordes <trevor@...nopolis.ca>
Cc: Michal Hocko <mhocko@...nel.org>, linux-kernel@...r.kernel.org,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Minchan Kim <minchan@...nel.org>,
Rik van Riel <riel@...riel.com>,
Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
Subject: Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)
On Mon, Jan 23, 2017 at 10:48:58AM +0000, Mel Gorman wrote:
> On Sun, Jan 22, 2017 at 06:45:59PM -0600, Trevor Cordes wrote:
> > On 2017-01-20 Mel Gorman wrote:
> > > >
> > > > Thanks for the OOM report. I was expecting it to be a particular
> > > > shape and my expectations were not matched so it took time to
> > > > consider it further. Can you try the cumulative patch below? It
> > > > combines three patches that
> > > >
> > > > 1. Allow slab shrinking even if the LRU pages are unreclaimable in
> > > >    direct reclaim
> > > > 2. Shrinks slab once based on the contents of all memcgs
> > > >    instead of shrinking one at a time
> > > > 3. Tries to shrink slabs if the lowmem usage is too high
> > > >
> > > > Unfortunately it's only boot tested on x86-64 as I didn't get the
> > > > chance to set up an i386 test bed.
> > > >
> > >
> > > There was one major flaw in that patch. This version fixes it and
> > > addresses other minor issues. It may still be too aggressive shrinking
> > > slab but worth trying out. Thanks.
> >
> > I ran with your patch below and it oom'd on the first night. It was
> > weird, it didn't hang the system, and my rebooter script started a
> > reboot but the system never got more than half down before it just sat
> > there in a weird state where a local console user could still login but
> > not much was working. So the patches don't seem to solve the problem.
> >
> > For the above compile I applied your patches to 4.10.0-rc4+, I hope
> > that's ok.
> >
>
> It would be strongly preferred to run them on top of Michal's other
> fixes. The main reason it's preferred is because this OOM differs from
> earlier ones in that it OOM killed from GFP_NOFS|__GFP_NOFAIL context.
> That meant that the slab shrinking could not happen from direct reclaim so
> the balancing from my patches would not occur. As Michal's other patches
> affect how kswapd behaves, it's important.
>
> Unfortunately, even that will be race prone for GFP_NOFS callers as
> they'll effectively be racing to see if kswapd or another direct
> reclaimer can reclaim before the OOM conditions are hit. It is by
> design, but it's apparent that a __GFP_NOFAIL request can trigger OOM
> relatively easily as it's not necessarily throttled or waiting on kswapd
> to complete any work. I'll keep thinking about it.
>
As a slight follow-up, albeit without patches, further options are to:

1. In should_reclaim_retry, account for SLAB_RECLAIMABLE as available
   pages when deciding to retry reclaim
2. Stall in should_reclaim_retry for __GFP_NOFAIL|__GFP_NOFS with a
   comment stating that the intent is to allow kswapd to make progress
   with the shrinker
3. Stall __GFP_NOFS in the direct reclaimer on a workqueue when it's
   failing to make progress to allow kswapd to do some work. This
   may be impaired if kswapd is locked up waiting for a lock held
   by the direct reclaimer
4. Schedule the system workqueue to drain slab for
   __GFP_NOFS|__GFP_NOFAIL.

3 and 4 are extremely heavy-handed so we should try them one at a time.
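
To make option 1 concrete, here is a rough userspace model of the retry
decision, not kernel code and not a patch. Everything in it is an
assumption made for illustration: struct zone_model, model_should_retry,
MODEL_MAX_RETRIES and the count_slab switch are hypothetical names, and
the backoff arithmetic is only meant to resemble the idea of discounting
reclaimable pages as loops without progress accumulate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical retry cap for this model only. */
#define MODEL_MAX_RETRIES 16UL

/* Hypothetical per-zone counters, all in pages. */
struct zone_model {
	unsigned long free_pages;
	unsigned long lru_reclaimable;
	unsigned long slab_reclaimable;
	unsigned long min_watermark;
};

static unsigned long div_round_up(unsigned long n, unsigned long d)
{
	return (n + d - 1) / d;
}

/*
 * Decide whether another reclaim attempt looks worthwhile.
 *
 * The estimate of reclaimable pages is discounted in proportion to the
 * number of loops that made no progress, then added to the free pages
 * and compared against the min watermark. With count_slab set,
 * reclaimable slab is included in the estimate (option 1); without it,
 * only LRU pages are considered.
 */
static bool model_should_retry(const struct zone_model *z,
			       unsigned long no_progress_loops,
			       bool count_slab)
{
	unsigned long reclaimable, available;

	assert(z != NULL);

	/* Give up once the retry budget is exhausted. */
	if (no_progress_loops > MODEL_MAX_RETRIES)
		return false;

	reclaimable = z->lru_reclaimable;
	if (count_slab)
		reclaimable += z->slab_reclaimable;

	/* Trust the estimate less with every loop that made no progress. */
	available = reclaimable -
		div_round_up(no_progress_loops * reclaimable,
			     MODEL_MAX_RETRIES);
	available += z->free_pages;

	return available > z->min_watermark;
}
```

With a zone that is low on free and LRU pages but heavy on reclaimable
slab, the model gives up immediately when slab is ignored and keeps
retrying for a while when slab is counted, which is the behaviour
option 1 is after.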
--
Mel Gorman
SUSE Labs