lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160721152631.GA30303@cmpxchg.org>
Date:	Thu, 21 Jul 2016 11:26:31 -0400
From:	Johannes Weiner <hannes@...xchg.org>
To:	Michal Hocko <mhocko@...nel.org>
Cc:	David Rientjes <rientjes@...gle.com>, linux-mm@...ck.org,
	Mikulas Patocka <mpatocka@...hat.com>,
	Ondrej Kozina <okozina@...hat.com>,
	Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
	Mel Gorman <mgorman@...e.de>, Neil Brown <neilb@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>, dm-devel@...hat.com
Subject: Re: [RFC PATCH 1/2] mempool: do not consume memory reserves from the
 reclaim path

On Thu, Jul 21, 2016 at 04:53:10PM +0200, Michal Hocko wrote:
> From: Michal Hocko <mhocko@...e.com>
> Date: Thu, 21 Jul 2016 16:40:59 +0200
> Subject: [PATCH] Revert "mm, mempool: only set __GFP_NOMEMALLOC if there are
>  free elements"
> 
> This reverts commit f9054c70d28bc214b2857cf8db8269f4f45a5e23.
> 
> There has been a report about OOM killer invoked when swapping out to
> a dm-crypt device. The primary reason seems to be that the swapout
> out IO managed to completely deplete memory reserves. Ondrej was

-out

> able to bisect and explained the issue by pointing to f9054c70d28b
> ("mm, mempool: only set __GFP_NOMEMALLOC if there are free elements").
> 
> The reason is that the swapout path is not throttled properly because
> the md-raid layer needs to allocate from the generic_make_request path
> which means it allocates from the PF_MEMALLOC context. dm layer uses
> mempool_alloc in order to guarantee a forward progress which used to
> inhibit access to memory reserves when using page allocator. This has
> changed by f9054c70d28b ("mm, mempool: only set __GFP_NOMEMALLOC if
> there are free elements") which has dropped the __GFP_NOMEMALLOC
> protection when the memory pool is depleted.
> 
> If we are running out of memory and the only way forward to free memory
> is to perform swapout we just keep consuming memory reserves rather than
> throttling the mempool allocations and allowing the pending IO to
> complete up to a moment when the memory is depleted completely and there
> is no way forward but invoking the OOM killer. This is less than
> optimal.
> 
> The original intention of f9054c70d28b was to help with the OOM
> situations where the oom victim depends on mempool allocation to make a
> forward progress. David has mentioned the following backtrace:
> 
> schedule
> schedule_timeout
> io_schedule_timeout
> mempool_alloc
> __split_and_process_bio
> dm_request
> generic_make_request
> submit_bio
> mpage_readpages
> ext4_readpages
> __do_page_cache_readahead
> ra_submit
> filemap_fault
> handle_mm_fault
> __do_page_fault
> do_page_fault
> page_fault
> 
> We do not know more about why the mempool is depleted without being
> replenished in time, though. In any case the dm layer shouldn't depend
> on any allocations outside of the dedicated pools so a forward progress
> should be guaranteed. If this is not the case then the dm should be
> fixed rather than papering over the problem and postponing it to later
> by accessing more memory reserves.
> 
> mempools are a mechanism to maintain dedicated memory reserves to guaratee
> forward progress. Allowing them an unbounded access to the page allocator
> memory reserves is going against the whole purpose of this mechanism.
> 
> Bisected-by: Ondrej Kozina <okozina@...hat.com>
> Signed-off-by: Michal Hocko <mhocko@...e.com>

Reviewed-by: Johannes Weiner <hannes@...xchg.org>

Thanks Michal

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ