Message-ID: <alpine.LRH.2.02.1607131004340.31769@file01.intranet.prod.int.rdu2.redhat.com>
Date:	Wed, 13 Jul 2016 10:18:35 -0400 (EDT)
From:	Mikulas Patocka <mpatocka@...hat.com>
To:	Michal Hocko <mhocko@...nel.org>
cc:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
	Ondrej Kozina <okozina@...hat.com>,
	Jerome Marchand <jmarchan@...hat.com>,
	Stanislav Kozina <skozina@...hat.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, David Rientjes <rientjes@...gle.com>
Subject: Re: System freezes after OOM



On Wed, 13 Jul 2016, Michal Hocko wrote:

> [CC David]
> 
> > > It is caused by the commit f9054c70d28bc214b2857cf8db8269f4f45a5e23. 
> > > Prior to this commit, mempool allocations set __GFP_NOMEMALLOC, so 
> > > they never exhausted reserved memory. With this commit, mempool 
> > > allocations drop __GFP_NOMEMALLOC, so they can dig deeper (if the 
> > > process has PF_MEMALLOC, they can bypass all limits).
> > 
> > I wonder whether commit f9054c70d28bc214 ("mm, mempool: only set 
> > __GFP_NOMEMALLOC if there are free elements") is doing correct thing. 
> > It says
> > 
> >     If an oom killed thread calls mempool_alloc(), it is possible that
> >     it'll loop forever if there are no elements on the freelist since
> >     __GFP_NOMEMALLOC prevents it from accessing needed memory reserves in
> >     oom conditions.
> 
> I haven't studied the patch very deeply so I might be missing something
> but from a quick look the patch does exactly what the above says.
> 
> mempool_alloc used to inhibit ALLOC_NO_WATERMARKS by default. David has
> only changed that to allow ALLOC_NO_WATERMARKS if there are no objects
> in the pool and so we have no fallback for the default __GFP_NORETRY
> request.

The swapper core sets the flag PF_MEMALLOC and calls generic_make_request 
to submit the swapping bio to the block driver. The device mapper driver 
uses mempools for all its I/O processing.

Prior to the patch f9054c70d28bc214b2857cf8db8269f4f45a5e23, mempool_alloc 
never exhausted the reserved memory - it first tried to allocate with 
__GFP_NOMEMALLOC (thus preventing the allocator from allocating below the 
limits), then it tried to allocate from the mempool reserve, and if the 
mempool was exhausted, it waited until some structures were returned to the 
mempool.

After the patch f9054c70d28bc214b2857cf8db8269f4f45a5e23, __GFP_NOMEMALLOC 
is not used if the mempool is exhausted - and so repeated use of 
mempool_alloc (together with PF_MEMALLOC, which is implicitly set) can 
exhaust all available memory.
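
The difference between the two behaviours can be illustrated with a toy 
user-space model in C. This is only a sketch, not kernel code: every name 
in it (toy_mem, toy_pool, toy_page_alloc, toy_mempool_alloc, toy_run) and 
the numbers 10, 4 and 2 are made up for illustration, and the real 
mempool_alloc also handles retries, reclaim flags and sleeping, which are 
left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model, NOT kernel code. Every name here is hypothetical. */
struct toy_mem {
    int free_pages;     /* pages currently free */
    int min_watermark;  /* ordinary allocations must leave this many free */
};

struct toy_pool {
    int curr_nr;        /* preallocated elements still held in the mempool */
};

enum toy_result { FROM_ALLOCATOR, FROM_POOL, MUST_WAIT };

/*
 * Page allocator stand-in. A PF_MEMALLOC caller may go below the watermark
 * unless the request carries the NOMEMALLOC restriction.
 */
bool toy_page_alloc(struct toy_mem *m, bool pf_memalloc, bool nomemalloc)
{
    int floor = (pf_memalloc && !nomemalloc) ? 0 : m->min_watermark;

    if (m->free_pages <= floor)
        return false;
    m->free_pages--;
    return true;
}

/*
 * patched == false: always restrict the allocator (behaviour before the patch).
 * patched == true:  restrict only while the pool still holds elements.
 */
enum toy_result toy_mempool_alloc(struct toy_mem *m, struct toy_pool *p,
                                  bool pf_memalloc, bool patched)
{
    bool nomemalloc = !patched || p->curr_nr > 0;

    if (toy_page_alloc(m, pf_memalloc, nomemalloc))
        return FROM_ALLOCATOR;
    if (p->curr_nr > 0) {
        p->curr_nr--;           /* fall back to the pool's own reserve */
        return FROM_POOL;
    }
    return MUST_WAIT;           /* the real code would sleep here */
}

/* Allocate repeatedly as a PF_MEMALLOC caller; return the free pages left. */
int toy_run(bool patched, int requests)
{
    struct toy_mem m = { .free_pages = 10, .min_watermark = 4 };
    struct toy_pool p = { .curr_nr = 2 };

    assert(requests >= 0);
    for (int i = 0; i < requests; i++)
        toy_mempool_alloc(&m, &p, true, patched);
    return m.free_pages;
}
```

In this model, toy_run(false, 20) stops at the watermark and leaves 4 pages 
free, while toy_run(true, 20) keeps allocating once the pool is empty and 
leaves 0 pages free.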

The patch f9054c70d28bc214b2857cf8db8269f4f45a5e23 allows more parallelism 
(mempool_alloc waits less and proceeds more often), but the downside is 
that it can exhaust all the memory. Bisection showed that those dm-crypt 
swapping failures were caused by that patch.

I think f9054c70d28bc214b2857cf8db8269f4f45a5e23 should be reverted - but 
first, we need to find out why swapping fails if all the memory is 
exhausted - that is a separate bug that should be addressed.

> > but we can allow mempool_alloc(__GFP_NOMEMALLOC) requests to access
> > memory reserves via below change, can't we?

There are no mempool_alloc(__GFP_NOMEMALLOC) requests - mempool users don't 
use this flag.

Mikulas
