linux-kernel - Re: [PATCH 9/9] mm: page_alloc: memory reserve access for OOM-killing allocations


Date:	Tue, 28 Apr 2015 15:30:09 +0200
From:	Michal Hocko <mhocko@...e.cz>
To:	Johannes Weiner <hannes@...xchg.org>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Dave Chinner <david@...morbit.com>,
	David Rientjes <rientjes@...gle.com>,
	Vlastimil Babka <vbabka@...e.cz>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 9/9] mm: page_alloc: memory reserve access for
 OOM-killing allocations

On Mon 27-04-15 15:05:55, Johannes Weiner wrote:
> The OOM killer connects random tasks in the system with unknown
> dependencies between them, and the OOM victim might well get blocked
> behind locks held by the allocating task.  That means that while
> allocations can issue OOM kills to improve the low memory situation,
> which generally frees more than they are going to take out, they can
> not rely on their *own* OOM kills to make forward progress.
> 
> However, OOM-killing allocations currently retry forever.  Without any
> extra measures the above situation will result in a deadlock; between
> the allocating task and the OOM victim at first, but it can spread
> once other tasks in the system start contending for the same locks.
> 
> Allow OOM-killing allocations to dip into the system's memory reserves
> to avoid this deadlock scenario.  Those reserves are specifically for
> operations in the memory reclaim paths which need a small amount of
> memory to release a much larger amount.  Arguably, the same notion
> applies to the OOM killer.

This will not work without some throttling. It hands basically every
allocating task that is allowed to trigger the OOM killer (and there
might be hundreds of them) a free ticket to all of the memory reserves,
and that by itself might prevent the OOM victim from exiting.

Your previous OOM watermark was nicer because it naturally throttled
allocations and still left some room for the exiting task.
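
To illustrate the difference, here is a rough userspace toy model, not
kernel code. Every name in it (toy_zone, toy_alloc_page,
toy_pages_left_for_victim, the two policies) is made up for this sketch,
and the single counter only loosely stands in for the real watermark
checks. It assumes each allocating task just grabs pages one at a time
until it is refused:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_zone {
	long free_pages;	/* pages still free, i.e. the remaining reserve */
	long oom_wmark;		/* hypothetical floor for OOM-killing allocations */
};

enum toy_policy {
	TOY_NO_WATERMARKS,	/* may drain the reserve down to zero */
	TOY_OOM_WMARK,		/* must stop once free_pages hits oom_wmark */
};

/* Take one page if the policy's floor still allows it. */
static bool toy_alloc_page(struct toy_zone *z, enum toy_policy policy)
{
	long floor = (policy == TOY_NO_WATERMARKS) ? 0 : z->oom_wmark;

	if (z->free_pages <= floor)
		return false;
	z->free_pages--;
	return true;
}

/*
 * Let ntasks tasks each try to take pages_each pages and report how
 * many pages remain for the exiting OOM victim.  The zone is passed
 * by value, so the caller's copy is left untouched.
 */
static long toy_pages_left_for_victim(struct toy_zone z, int ntasks,
				      int pages_each, enum toy_policy policy)
{
	assert(ntasks >= 0 && pages_each >= 0);

	for (int t = 0; t < ntasks; t++)
		for (int p = 0; p < pages_each; p++)
			if (!toy_alloc_page(&z, policy))
				return z.free_pages;	/* floor reached */
	return z.free_pages;
}
```

In this model, a zone that starts with 128 free pages and a floor of 64,
hit by 200 tasks wanting 4 pages each, ends up with nothing left for the
victim under TOY_NO_WATERMARKS, while TOY_OOM_WMARK stops the allocators
at the floor and leaves 64 pages. The numbers are arbitrary and only
show the shape of the problem.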

> Signed-off-by: Johannes Weiner <hannes@...xchg.org>
> ---
>  mm/page_alloc.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 94530db..5f3806d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2384,6 +2384,20 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order, int alloc_flags,
>  		if (WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL))
>  			*did_some_progress = 1;
>  	}
> +
> +	/*
> +	 * In the current implementation, an OOM-killing allocation
> +	 * loops indefinitely inside the allocator.  However, it's
> +	 * possible for the OOM victim to get stuck behind locks held
> +	 * by the allocating task itself, so we can never rely on the
> +	 * OOM killer to free memory synchronously without risking a
> +	 * deadlock.  Allow these allocations to dip into the memory
> +	 * reserves to ensure forward progress once the OOM kill has
> +	 * been issued.  The reserves will be replenished when the
> +	 * caller releases the locks and the victim exits.
> +	 */
> +	if (*did_some_progress)
> +		alloc_flags |= ALLOC_NO_WATERMARKS;
>  out:
>  	mutex_unlock(&oom_lock);
>  alloc:
> -- 
> 2.3.4
> 

-- 
Michal Hocko
SUSE Labs