linux-kernel - Re: [PATCH 0/3] OOM detection rework v4

Message-ID: <20160204133905.GB14425@dhcp22.suse.cz>
Date:	Thu, 4 Feb 2016 14:39:06 +0100
From:	Michal Hocko <mhocko@...nel.org>
To:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:	rientjes@...gle.com, akpm@...ux-foundation.org,
	torvalds@...ux-foundation.org, hannes@...xchg.org, mgorman@...e.de,
	hillf.zj@...baba-inc.com, kamezawa.hiroyu@...fujitsu.com,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/3] OOM detection rework v4

On Thu 04-02-16 22:10:54, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > I am not sure we can fix these pathological loads where we hit the
> > higher order depletion and there is a chance that one of the thousands
> > tasks terminates in an unpredictable way which happens to race with the
> > OOM killer.
> 
> When I hit this problem on Dec 24th, I didn't run thousands of tasks.
> I think there were less than one hundred tasks in the system and only
> a few tasks were running. Not a pathological load at all.

But as the OOM report clearly stated, there were no pages above order-1
available in that particular case. And that was after direct reclaim
and compaction had already been invoked.

As I've mentioned in the referenced email, we could retry multiple
times, i.e. not give up on the higher order requests until we hit the
maximum number of retries, but I consider that quite ugly to be honest.
I think that proper communication with compaction is a more
appropriate way to go long term. E.g. I find it interesting that
try_to_compact_pages doesn't even care about PAGE_ALLOC_COSTLY_ORDER
and treats it as any other high order request.

Something like the following:
---
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 269a04f20927..1ae5b7da821b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3106,6 +3106,18 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 		}
 	}
 
+	/*
+	 * OK, so the watermark check has failed. Make sure we do all the
+	 * retries for !costly high order requests and hope that multiple
+	 * runs of compaction will generate some high order ones for us.
+	 *
+	 * XXX: ideally we should teach the compaction to try _really_ hard
+	 * if we are in the retry path - something like priority 0 for the
+	 * reclaim
+	 */
+	if (order <= PAGE_ALLOC_COSTLY_ORDER)
+		return true;
+
 	return false;
 }
 
@@ -3281,11 +3293,11 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		goto noretry;
 
 	/*
-	 * Costly allocations might have made a progress but this doesn't mean
-	 * their order will become available due to high fragmentation so do
-	 * not reset the no progress counter for them
+	 * High order allocations might have made a progress but this doesn't
+	 * mean their order will become available due to high fragmentation so
+	 * do not reset the no progress counter for them
 	 */
-	if (did_some_progress && order <= PAGE_ALLOC_COSTLY_ORDER)
+	if (did_some_progress && !order)
 		no_progress_loops = 0;
 	else
 		no_progress_loops++;
-- 
Michal Hocko
SUSE Labs