Message-ID: <alpine.LRH.2.02.1607131004340.31769@file01.intranet.prod.int.rdu2.redhat.com>
Date: Wed, 13 Jul 2016 10:18:35 -0400 (EDT)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Michal Hocko <mhocko@...nel.org>
cc: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
Ondrej Kozina <okozina@...hat.com>,
Jerome Marchand <jmarchan@...hat.com>,
Stanislav Kozina <skozina@...hat.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, David Rientjes <rientjes@...gle.com>
Subject: Re: System freezes after OOM
On Wed, 13 Jul 2016, Michal Hocko wrote:
> [CC David]
>
> > > It is caused by the commit f9054c70d28bc214b2857cf8db8269f4f45a5e23.
> > > Prior to this commit, mempool allocations set __GFP_NOMEMALLOC, so
> > > they never exhausted reserved memory. With this commit, mempool
> > > allocations drop __GFP_NOMEMALLOC, so they can dig deeper (if the
> > > process has PF_MEMALLOC, they can bypass all limits).
> >
> > I wonder whether commit f9054c70d28bc214 ("mm, mempool: only set
> > __GFP_NOMEMALLOC if there are free elements") is doing correct thing.
> > It says
> >
> > If an oom killed thread calls mempool_alloc(), it is possible that it'll
> > loop forever if there are no elements on the freelist since
> > __GFP_NOMEMALLOC prevents it from accessing needed memory reserves in
> > oom conditions.
>
> I haven't studied the patch very deeply so I might be missing something
> but from a quick look the patch does exactly what the above says.
>
> mempool_alloc used to inhibit ALLOC_NO_WATERMARKS by default. David has
> only changed that to allow ALLOC_NO_WATERMARKS if there are no objects
> in the pool and so we have no fallback for the default __GFP_NORETRY
> request.
The swapper core sets the flag PF_MEMALLOC and calls generic_make_request
to submit the swapping bio to the block driver. The device mapper driver
uses mempools for all its I/O processing.
Prior to the patch f9054c70d28bc214b2857cf8db8269f4f45a5e23, mempool_alloc
never exhausted the reserved memory - it first tried to allocate with
__GFP_NOMEMALLOC (thus preventing the allocator from allocating below the
limits), then it tried to allocate from the mempool reserve, and if the
mempool was exhausted, it waited until some structures were returned to the
mempool.
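As a rough illustration, the pre-patch ordering described above can be
modelled in userspace C. This is a toy sketch, not the kernel code: the
struct, the enum and the function name are all hypothetical, and "waiting"
is reduced to a return value.

```c
/* Toy model only: none of these names exist in the kernel. */
enum toy_source { TOY_FROM_ALLOCATOR, TOY_FROM_POOL, TOY_MUST_WAIT };

struct toy_system {
	int free_above_watermark; /* pages the allocator may hand out normally */
	int emergency_reserve;    /* pages below the watermark (reserved memory) */
	int pool_elements;        /* preallocated elements held by the mempool */
};

/*
 * Pre-patch order: page allocator with reserves forbidden (the effect of
 * __GFP_NOMEMALLOC), then the mempool's own elements, then wait.
 */
static enum toy_source toy_mempool_alloc_old(struct toy_system *s)
{
	if (s->free_above_watermark > 0) {
		s->free_above_watermark--;
		return TOY_FROM_ALLOCATOR;
	}
	if (s->pool_elements > 0) {
		s->pool_elements--;
		return TOY_FROM_POOL;
	}
	/* emergency_reserve is never touched on this path */
	return TOY_MUST_WAIT;
}
```

In this model the emergency reserve stays intact no matter how many times
the function is called, which is the property the paragraph above describes.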
After the patch f9054c70d28bc214b2857cf8db8269f4f45a5e23, __GFP_NOMEMALLOC
is not used if the mempool is exhausted - and so repeated use of
mempool_alloc (together with PF_MEMALLOC that is implicitly set) can
exhaust all available memory.
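The post-patch behaviour can be sketched the same way. This is a simplified
userspace model under stated assumptions, not the kernel implementation:
the flag value and all toy_* names are hypothetical, and PF_MEMALLOC is
treated as the only way to reach the reserves.

```c
#include <stdbool.h>

#define TOY_GFP_NOMEMALLOC 0x1u /* hypothetical bit value, not the kernel's */

/* Post-patch rule: forbid the reserves only while the pool has elements. */
static unsigned toy_gfp_after_patch(unsigned gfp, int pool_elements)
{
	if (pool_elements > 0)
		gfp |= TOY_GFP_NOMEMALLOC;
	return gfp;
}

/* Simplification: a PF_MEMALLOC caller may dip below the limits unless
 * the request carries the NOMEMALLOC flag. */
static bool toy_may_use_reserves(unsigned gfp, bool pf_memalloc)
{
	return pf_memalloc && !(gfp & TOY_GFP_NOMEMALLOC);
}

/* Count the reserve pages consumed by repeated requests against an
 * already exhausted pool, issued by a PF_MEMALLOC caller. */
static int toy_drain(int reserve, int requests)
{
	int taken = 0;

	for (int i = 0; i < requests && reserve > 0; i++) {
		unsigned gfp = toy_gfp_after_patch(0, 0); /* pool is empty */

		if (toy_may_use_reserves(gfp, true)) {
			reserve--;
			taken++;
		}
	}
	return taken;
}
```

With an empty pool and PF_MEMALLOC set, every request in the model is
allowed into the reserve, so repeated calls drain it completely.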
The patch f9054c70d28bc214b2857cf8db8269f4f45a5e23 allows more parallelism
(mempool_alloc waits less and proceeds more often), but the downside is
that it exhausts all the memory. Bisection showed that those dm-crypt
swapping failures were caused by that patch.
I think f9054c70d28bc214b2857cf8db8269f4f45a5e23 should be reverted - but
first, we need to find out why swapping fails if all the memory is
exhausted - that is a separate bug that should be addressed first.
> > but we can allow mempool_alloc(__GFP_NOMEMALLOC) requests to access
> > memory reserves via below change, can't we?
There are no mempool_alloc(__GFP_NOMEMALLOC) requests - mempool users don't
use this flag.
Mikulas