Message-ID: <20160718151445.GB14604@cmpxchg.org>
Date:	Mon, 18 Jul 2016 11:14:45 -0400
From:	Johannes Weiner <hannes@...xchg.org>
To:	Mikulas Patocka <mpatocka@...hat.com>
Cc:	David Rientjes <rientjes@...gle.com>,
	Michal Hocko <mhocko@...nel.org>,
	Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
	Ondrej Kozina <okozina@...hat.com>,
	Jerome Marchand <jmarchan@...hat.com>,
	Stanislav Kozina <skozina@...hat.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, dm-devel@...hat.com,
	Dave Chinner <david@...morbit.com>
Subject: Re: System freezes after OOM

CC Dave Chinner, who I recall had strong opinions on the mempool model.

The context is commit f9054c7 ("mm, mempool: only set __GFP_NOMEMALLOC
if there are free elements"), which gives MEMALLOC/TIF_MEMDIE mempool
allocations access to the system emergency reserves when there is no
reserved object currently residing in the mempool.
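
As a rough illustration of the behavior change described above, the
sketch below models the flag decision in plain userspace C. It is not
the kernel code: TOY_NOMEMALLOC is a made-up stand-in for
__GFP_NOMEMALLOC, and the toy_* helpers, the curr_nr parameter, and the
"before" behavior (flag always set, as the commit title implies) are
assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit standing in for the kernel's __GFP_NOMEMALLOC. */
#define TOY_NOMEMALLOC 0x1u

/* Assumed pre-commit behavior: mempool allocations always opt out of
 * the system emergency reserves, whatever the pool currently holds. */
static unsigned int toy_flags_before(unsigned int gfp, int curr_nr)
{
    assert(curr_nr >= 0);
    return gfp | TOY_NOMEMALLOC;
}

/* Behavior described for f9054c7: opt out only while the pool still
 * has at least one reserved object (curr_nr > 0). */
static unsigned int toy_flags_after(unsigned int gfp, int curr_nr)
{
    assert(curr_nr >= 0);
    return curr_nr > 0 ? (gfp | TOY_NOMEMALLOC) : gfp;
}

/* A MEMALLOC/TIF_MEMDIE-style context may use the emergency reserves
 * only if the opt-out flag is not set on the allocation. */
static bool toy_may_use_reserves(unsigned int gfp, bool memalloc_ctx)
{
    return memalloc_ctx && !(gfp & TOY_NOMEMALLOC);
}
```

With an empty pool (curr_nr == 0) and a MEMALLOC-style caller,
toy_flags_after() leaves the opt-out flag clear, so the allocation is
allowed to draw on the system reserves instead of waiting for an
object to come back to the pool.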

On Fri, Jul 15, 2016 at 07:21:59AM -0400, Mikulas Patocka wrote:
> On Thu, 14 Jul 2016, David Rientjes wrote:
> 
> > There is no guarantee that _anything_ can return memory to the mempool,
> 
> You misunderstand mempools if you make such claims.

Uhm, fully agreed.

The point of mempools is that they have their own reserves, separate
from the system reserves, to make forward progress in OOM situations.

All mempool object holders promise to make forward progress, and when
memory is depleted, the mempool allocations serialize against each
other. In this case, every allocation has to wait for in-flight IO to
finish to pass the reserved object on to the next IO. That's how the
mempool model is designed. The commit in question breaks this by not
waiting for outstanding object holders and instead quickly depleting
the system reserves. That's a mempool causing a memory deadlock...
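
The model described above can be sketched as a small userspace toy.
This is only an illustration under stated assumptions, not the kernel's
mempool implementation: struct toy_pool, TOY_MIN_NR, the system_oom
switch and the toy_pool_* functions are all hypothetical names, and a
NULL return stands in for the point where a real mempool_alloc() caller
would sleep until an object is handed back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MIN_NR 2 /* hypothetical size of the pool's private reserve */

struct toy_pool {
    void *reserve[TOY_MIN_NR]; /* objects preallocated at init time */
    int curr_nr;               /* objects currently sitting in the reserve */
    int system_oom;            /* 1 simulates a depleted system allocator */
};

/* Fill the private reserve up front, while memory is still available. */
static int toy_pool_init(struct toy_pool *p, size_t obj_size)
{
    p->curr_nr = 0;
    p->system_oom = 0;
    for (int i = 0; i < TOY_MIN_NR; i++) {
        void *obj = malloc(obj_size);
        if (!obj) {
            while (p->curr_nr > 0)
                free(p->reserve[--p->curr_nr]);
            return -1;
        }
        p->reserve[p->curr_nr++] = obj;
    }
    return 0;
}

/* Try the normal allocator first, then the pool's own reserve.
 * NULL means "would have to wait for an object holder to finish";
 * the system emergency reserves are never touched. */
static void *toy_pool_alloc(struct toy_pool *p, size_t obj_size)
{
    if (!p->system_oom) {
        void *obj = malloc(obj_size);
        if (obj)
            return obj;
    }
    if (p->curr_nr > 0)
        return p->reserve[--p->curr_nr];
    return NULL;
}

/* A finishing holder refills the reserve first, which is what lets
 * the next waiting allocation make progress. */
static void toy_pool_free(struct toy_pool *p, void *obj)
{
    assert(obj != NULL);
    if (p->curr_nr < TOY_MIN_NR)
        p->reserve[p->curr_nr++] = obj;
    else
        free(obj);
}
```

Once system_oom is set and both reserved objects are handed out, a
third toy_pool_alloc() returns NULL until some holder calls
toy_pool_free(); the freed object then goes straight to the next
allocation. That hand-off is the serialization the model relies on.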

David observed systems hanging 2+h inside mempool allocations. But
where would an object holder get stuck? It can't be taking a lock
that the waiting mempool_alloc() is holding, obviously. It also can't
be waiting for another allocation: it makes no sense to use mempools
to guarantee forward progress, but then have the whole sequence rely
on an unguaranteed allocation to succeed after the mempool ones. So
how could a system-wide OOM situation cause a mempool holder to hang?

These hangs are fishy, but it seems reasonable to assume that somebody
is breaking the mempool contract somewhere. The solution cannot be
to abandon the mempool model. f9054c7 should be reverted.
