linux-kernel - Re: [PATCH 1/2] mm, oom: Give __GFP_NOFAIL allocations access to memory reserves

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20151125111801.GD27283@dhcp22.suse.cz>
Date:	Wed, 25 Nov 2015 12:18:01 +0100
From:	Michal Hocko <mhocko@...nel.org>
To:	David Rientjes <rientjes@...gle.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Mel Gorman <mgorman@...e.de>,
	Johannes Weiner <hannes@...xchg.org>, linux-mm@...ck.org,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] mm, oom: Give __GFP_NOFAIL allocations access to
 memory reserves

On Wed 25-11-15 02:51:38, David Rientjes wrote:
> On Wed, 25 Nov 2015, Michal Hocko wrote:
> 
> > From: Michal Hocko <mhocko@...e.com>
> > 
> > __GFP_NOFAIL is a big hammer used to ensure that the allocation
> > request can never fail. This is a strong requirement and as such
> > it also deserves a special treatment when the system is OOM. The
> > primary problem here is that the allocation request might have
> > come with some locks held and the oom victim might be blocked
> > on the same locks. This is basically an OOM deadlock situation.
> > 
> > This patch tries to reduce the risk of such a deadlocks by giving
> > __GFP_NOFAIL allocations a special treatment and let them dive into
> > memory reserves after oom killer invocation. This should help them
> > to make a progress and release resources they are holding. The OOM
> > victim should compensate for the reserves consumption.
> > 
> > Suggested-by: Andrea Arcangeli <aarcange@...hat.com>
> > Signed-off-by: Michal Hocko <mhocko@...e.com>
> > ---
> >  mm/page_alloc.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 8034909faad2..70db11c27046 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -2766,8 +2766,13 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
> >  			goto out;
> >  	}
> >  	/* Exhausted what can be done so it's blamo time */
> > -	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL))
> > +	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL)) {
> >  		*did_some_progress = 1;
> > +
> > +		if (gfp_mask & __GFP_NOFAIL)
> > +			page = get_page_from_freelist(gfp_mask, order,
> > +					ALLOC_NO_WATERMARKS|ALLOC_CPUSET, ac);
> > +	}
> >  out:
> >  	mutex_unlock(&oom_lock);
> >  	return page;
> 
> I don't understand why you're setting ALLOC_CPUSET if you're giving them 
> "special treatment".  If you want to allow access to memory reserves to 
> prevent an oom livelock, then why not also allow it access to allocate 
> outside its cpuset?

Good question. My thinking was that __GFP_NOFAIL allocations might be
done on behalf on a process so they are not necessarily system wide. We
do the same before we actually go to out_of_memory. On the other hand
__GFP_NOFAIL should be used really rarely and so breaking the cpuset
restriction shouldn't be a big deal if that helps to break out from the
potential OOM deadlock. I will drop it.

Thanks!
---
>From d89d17e72e5e1c03539f8c81fc6e120bccd2b460 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@...e.com>
Date: Tue, 23 Jun 2015 09:15:00 +0200
Subject: [PATCH] mm, oom: Give __GFP_NOFAIL allocations access to memory
 reserves

__GFP_NOFAIL is a big hammer used to ensure that the allocation
request can never fail. This is a strong requirement and as such
it also deserves a special treatment when the system is OOM. The
primary problem here is that the allocation request might have
come with some locks held and the oom victim might be blocked
on the same locks. This is basically an OOM deadlock situation.

This patch tries to reduce the risk of such a deadlocks by giving
__GFP_NOFAIL allocations a special treatment and let them dive into
memory reserves after oom killer invocation. This should help them
to make a progress and release resources they are holding. The OOM
victim should compensate for the reserves consumption.

[rientjes@...gle.com: do not use ALLOC_CPUSET]
Suggested-by: Andrea Arcangeli <aarcange@...hat.com>
Signed-off-by: Michal Hocko <mhocko@...e.com>
---
 mm/page_alloc.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8034909faad2..94b04c1e894a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2766,8 +2766,13 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 			goto out;
 	}
 	/* Exhausted what can be done so it's blamo time */
-	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL))
+	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL)) {
 		*did_some_progress = 1;
+
+		if (gfp_mask & __GFP_NOFAIL)
+			page = get_page_from_freelist(gfp_mask, order,
+					ALLOC_NO_WATERMARKS, ac);
+	}
 out:
 	mutex_unlock(&oom_lock);
 	return page;
-- 
2.6.2


-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/