Message-ID: <20160814125111.GE9248@dhcp22.suse.cz>
Date:	Sun, 14 Aug 2016 14:51:12 +0200
From:	Michal Hocko <mhocko@...nel.org>
To:	arekm@...en.pl
Cc:	linux-ext4@...r.kernel.org, linux-mm@...r.kernel.org
Subject: Re: 4.7.0, cp -al causes OOM

On Fri 12-08-16 09:43:40, Michal Hocko wrote:
> Hi,
> 
> On Fri 12-08-16 09:01:41, Arkadiusz Miskiewicz wrote:
[...]
> > [87259.568395] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15360kB
> > [87259.568403] Node 0 DMA32: 11467*4kB (UME) 1525*8kB (UME) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 58068kB
> > [87259.568411] Node 0 Normal: 9927*4kB (UMEH) 1119*8kB (UMH) 19*16kB (H) 8*32kB (H) 2*64kB (H) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 49348kB
> 
> As you can see there are barely some high order pages available. There
> are few in the atomic reserves which is a bit surprising because I would
> expect they would get released under a heavy memory pressure. I will
> double check that part.

OK, so the reason is that we try to preserve at least one pageblock per
zone. That does not matter much overall, but I guess we should just
release those pageblocks, because an OOM is certainly much worse than a
high-order GFP_ATOMIC request failing. The diff below does that. I am a
bit skeptical this will make much of a difference, but let's give it a
try. I will also send another patch which should show
compaction/migration counters during high-order OOMs. This might tell us
a bit more about the compaction behavior.
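
To illustrate the first hunk of the diff in isolation, here is a rough
userspace sketch of the per-zone skip check before and after the change.
It is not kernel code: skip_zone_old(), skip_zone_new() and the
PAGEBLOCK_NR_PAGES value of 512 (4kB pages, 2MB pageblocks) are
assumptions made up for this illustration only.

```c
#include <stdbool.h>

/*
 * Assumed stand-in for the kernel's pageblock_nr_pages. 512 corresponds
 * to 4kB pages and 2MB pageblocks; the real value depends on the
 * architecture and kernel configuration.
 */
#define PAGEBLOCK_NR_PAGES 512UL

/*
 * Hypothetical helper mirroring the old check: a zone is skipped while
 * it holds one pageblock or less in the highatomic reserve, so the last
 * reserved pageblock is never released.
 */
static bool skip_zone_old(unsigned long nr_reserved_highatomic)
{
	return nr_reserved_highatomic <= PAGEBLOCK_NR_PAGES;
}

/*
 * Hypothetical helper mirroring the new check: a zone is skipped only
 * when nothing is reserved at all, so the last pageblock can go too.
 */
static bool skip_zone_new(unsigned long nr_reserved_highatomic)
{
	return !nr_reserved_highatomic;
}
```

With exactly one pageblock reserved, skip_zone_old() returns true (the
zone is left alone) while skip_zone_new() returns false (the zone is a
candidate for unreserving). Both return true for an empty reserve.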
---
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9d46b65061be..b8600943184e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2053,8 +2053,7 @@ static void unreserve_highatomic_pageblock(const struct alloc_context *ac)
 
 	for_each_zone_zonelist_nodemask(zone, z, zonelist, ac->high_zoneidx,
 								ac->nodemask) {
-		/* Preserve at least one pageblock */
-		if (zone->nr_reserved_highatomic <= pageblock_nr_pages)
+		if (!zone->nr_reserved_highatomic)
 			continue;
 
 		spin_lock_irqsave(&zone->lock, flags);
@@ -3276,11 +3275,10 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
 
 	/*
 	 * If an allocation failed after direct reclaim, it could be because
-	 * pages are pinned on the per-cpu lists or in high alloc reserves.
+	 * pages are pinned on the per-cpu lists.
 	 * Shrink them them and try again
 	 */
 	if (!page && !drained) {
-		unreserve_highatomic_pageblock(ac);
 		drain_all_pages(NULL);
 		drained = true;
 		goto retry;
@@ -3636,6 +3634,12 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		goto retry;
 
 	/*
+	 * Make sure we are not pinning atomic higher order reserves when we
+	 * are really fighting to get !costly order and running out of memory
+	 */
+	unreserve_highatomic_pageblock(ac);
+
+	/*
 	 * It doesn't make any sense to retry for the compaction if the order-0
 	 * reclaim is not able to make any progress because the current
 	 * implementation of the compaction depends on the sufficient amount
-- 
Michal Hocko
SUSE Labs