Date:	Tue, 19 Jul 2016 09:42:39 +0200
From:	Vlastimil Babka <vbabka@...e.cz>
To:	Joonsoo Kim <iamjoonsoo.kim@....com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	Michal Hocko <mhocko@...nel.org>,
	Mel Gorman <mgorman@...hsingularity.net>,
	David Rientjes <rientjes@...gle.com>,
	Rik van Riel <riel@...hat.com>
Subject: Re: [PATCH v3 12/17] mm, compaction: more reliably increase direct
 compaction priority

On 07/19/2016 06:53 AM, Joonsoo Kim wrote:
> On Mon, Jul 18, 2016 at 02:21:02PM +0200, Vlastimil Babka wrote:
>> On 07/18/2016 06:41 AM, Joonsoo Kim wrote:
>>> On Fri, Jul 15, 2016 at 03:37:52PM +0200, Vlastimil Babka wrote:
>>>> On 07/06/2016 07:39 AM, Joonsoo Kim wrote:
>>>>> On Fri, Jun 24, 2016 at 11:54:32AM +0200, Vlastimil Babka
>>>>> wrote:
>>>>>> During the reclaim/compaction loop, compaction priority can
>>>>>> be increased by the should_compact_retry() function, but the
>>>>>> current code is not optimal. Priority is only increased
>>>>>> when compaction_failed() is true, which means that
>>>>>> compaction has scanned the whole zone. This may not happen
>>>>>> even after multiple attempts with the lower priority due to
>>>>>> parallel activity, so we might needlessly struggle on the
>>>>>> lower priority and possibly run out of compaction retry 
>>>>>> attempts in the process.
>>>>>> 
>>>>>> We can remove these corner cases by increasing compaction
>>>>>> priority regardless of compaction_failed(). Further
>>>>>> examination of the compaction result can be postponed until
>>>>>> after reaching the highest priority. This is a simple
>>>>>> solution and we don't need to worry about reaching the
>>>>>> highest priority "too soon" here, because when
>>>>>> should_compact_retry() is called it means that the system
>>>>>> is already struggling and the allocation is supposed to
>>>>>> either try as hard as possible, or it cannot fail at all.
>>>>>> There's not much point staying at lower priorities with
>>>>>> heuristics that may result in only partial compaction. Also
>>>>>> we now count compaction retries only after reaching the
>>>>>> highest priority.
>>>>> 
>>>>> I'm not sure that this patch is safe. Deferring and the
>>>>> skip-bit in compaction are closely tied to reclaim/compaction.
>>>>> Just ignoring them and (almost) unconditionally increasing
>>>>> compaction priority will result in less reclaim and a lower
>>>>> compaction success rate.
>>>> 
>>>> I don't see why there would be less reclaim. Reclaim is always
>>>> attempted before compaction, and compaction priority doesn't
>>>> affect it. And as long as reclaim wants to retry,
>>>> should_compact_retry() isn't even called, so the priority stays.
>>>> I wanted to change that in v1, but Michal suggested I shouldn't.
>>> 
>>> Assume a situation where there is no !costly high-order free page
>>> because of fragmentation. In this case, should_reclaim_retry()
>>> would return false, since the watermark cannot be met due to the
>>> absence of a high-order free page. Now consider
>>> should_compact_retry() under the assumption that there are enough
>>> order-0 free pages. With your patchset, reclaim/compaction is only
>>> retried two times (SYNC_LIGHT and SYNC_FULL), since
>>> compaction_withdrawn() returns false with enough free pages and
>>> !COMPACT_SKIPPED.
>>> 
>>> But before your patchset, COMPACT_PARTIAL_SKIPPED and
>>> COMPACT_DEFERRED were considered withdrawn, so reclaim/compaction
>>> would be retried more times.
>> 
>> Perhaps, but that wouldn't guarantee reaching the highest priority.
> 
> Yes.
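
For reference, the disagreement above turns on which results
compaction_withdrawn() treats as a back-off that justifies another
reclaim/compaction round. Before this series it classified roughly as
below; this is an abridged paraphrase of include/linux/compaction.h
from around v4.7, not a verbatim copy, and the enum is trimmed to the
relevant values:

/* Abridged paraphrase, not the verbatim kernel code. */
enum compact_result {
	COMPACT_SKIPPED,	 /* order-0 watermarks not met, reclaim first */
	COMPACT_DEFERRED,	 /* backed off after recent failures */
	COMPACT_CONTENDED,	 /* aborted due to lock contention */
	COMPACT_PARTIAL_SKIPPED, /* scanners met before covering the zone */
	COMPACT_COMPLETE,	 /* whole zone scanned without success */
	COMPACT_PARTIAL,	 /* a page of the requested order is available */
};

static inline bool compaction_withdrawn(enum compact_result result)
{
	/* All four of these counted as "compaction backed off": */
	if (result == COMPACT_SKIPPED)
		return true;
	if (result == COMPACT_DEFERRED)
		return true;
	if (result == COMPACT_CONTENDED)
		return true;
	if (result == COMPACT_PARTIAL_SKIPPED)
		return true;

	return false;
}

Joonsoo's observation is that once COMPACT_DEFERRED and
COMPACT_PARTIAL_SKIPPED stop flowing through this path, the retry loop
is left with only the priority increases before giving up.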

Since this is my greatest concern here, would the alternative patch at
the end of the mail work for you? Trying your test would be nice too,
but it can also wait until I repost the whole series (the missed
watermark checks you spotted in patch 13 could also play a role there).

> 
>> order-3 allocation just to avoid OOM, ignoring that the system
>> might be thrashing heavily? Previously it also wasn't guaranteed
>> to reclaim everything, but what is the optimal number of retries?
> 
> So, you're applying the same logic as in the other thread we
> discussed yesterday. The fact that it wasn't guaranteed to reclaim
> everything before doesn't mean that we can relax the guarantee
> further.
> 
> I'm not sure the below is relevant to this series, but just to note:
> 
> I don't know the optimal number of retries. We are on the way to
> finding it, and I hope this discussion helps. I don't think we can
> judge the point properly by simply checking stat information at some
> moment; it carries too little knowledge about the system, so it
> would wrongly advise us to invoke OOM prematurely.
> 
> I think that using the compaction result isn't a good way to
> determine whether further reclaim/compaction is useless, because the
> compaction result itself can vary with further reclaim/compaction.

If we scan the whole zone, ignoring all the heuristics, and still fail,
I think that's a pretty reliable signal (modulo parallel activity,
because then we can indeed never be sure).
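
For reference, compaction_failed() is exactly this "scanned the whole
zone and still failed" signal; paraphrasing the helper from the same
era (abridged, using the enum from the sketch earlier in the thread):

static inline bool compaction_failed(enum compact_result result)
{
	/* The whole zone was scanned and still no page of the
	 * requested order could be produced. */
	if (result == COMPACT_COMPLETE)
		return true;

	return false;
}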

> If we want to check more accurately whether compaction is really
> impossible, scanning the whole range and checking the arrangement of
> free pages and LRU (movable) pages would help more.

But whole-zone compaction just did exactly this and failed? Sure, we
might have missed something due to the way the compaction scanners meet
around the middle of the zone, but that's a reason to improve the
algorithm, not to attempt more reclaim based on checks that would
duplicate the scanning work.
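
To make the meeting-point limitation concrete, here is a toy model in
plain userspace C; it is not kernel code, and the page states and the
"migration" step are invented purely for illustration. The migrate
scanner walks up from pfn 0, the free scanner walks down from the last
pfn, and the run ends where they cross, so movable pages above the
meeting point are never examined in that run:

#include <stdio.h>

/* Toy model only: each slot is a page that is free, movable in use,
 * or unmovable in use. */
enum page_state { FREE, MOVABLE, UNMOVABLE };

/*
 * One compaction run in miniature. Returns the meeting point; movable
 * pages above it were never looked at.
 */
static unsigned long toy_compact(enum page_state *zone, unsigned long nr)
{
	unsigned long migrate = 0, free_pfn = nr - 1;

	while (migrate < free_pfn) {
		if (zone[migrate] != MOVABLE) {
			migrate++;		/* nothing to migrate here */
			continue;
		}
		if (zone[free_pfn] != FREE) {
			free_pfn--;		/* no target page here */
			continue;
		}
		/* "migrate": the low page becomes free, the high one used */
		zone[free_pfn--] = MOVABLE;
		zone[migrate++] = FREE;
	}
	return migrate;
}

int main(void)
{
	enum page_state zone[16] = {
		UNMOVABLE, MOVABLE, UNMOVABLE, MOVABLE, FREE, MOVABLE,
		UNMOVABLE, MOVABLE, FREE, MOVABLE, FREE, MOVABLE,
		FREE, MOVABLE, FREE, FREE,
	};

	printf("scanners met at pfn %lu of 16\n", toy_compact(zone, 16));
	return 0;
}

In this model, extra FREE slots near the end keep free_pfn high for
longer, which is the same effect as point b) below: more reclaimed
pages let the migrate scanner reach higher pfns before the run ends.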

> Although there is some possibility that compaction fails even if
> this check passes, it would give us more information about the
> system state, and we would invoke OOM less prematurely. In the case
> where compaction success is theoretically possible, we could keep
> retrying reclaim/compaction even if full compaction fails, because
> we would have hope that more free pages give compaction a higher
> probability of success.

More free pages can only improve the probability because of a) more
resilience against parallel memory allocations getting us below the low
order-0 watermark during our compaction, and b) an increased chance of
the migrate scanner reaching higher pfns in the zone when there is
unmovable fragmentation in the lower pfns. Both are problems to
potentially solve, and I think further tuning the reclaim/compaction
retry decisions is just a bad workaround, and definitely not something
I would like to do in this series. So in the patch below I'll try to
avoid decreasing the number of retries, but no more than that:

-----8<-----
From a942ff54f7aeb2cb9cca9b868b3dde6cac90e924 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@...e.cz>
Date: Tue, 19 Jul 2016 09:26:06 +0200
Subject: [PATCH] mm, compaction: more reliably increase direct compaction
 priority

During the reclaim/compaction loop, compaction priority can be increased by
the should_compact_retry() function, but the current code is not optimal.
Priority is only increased when compaction_failed() is true, which means that
compaction has scanned the whole zone. This may not happen even after multiple
with the lower priority due to parallel activity, so we might needlessly
struggle on the lower priority and possibly run out of compaction retry
attempts in the process.

After this patch we are guaranteed at least one attempt at the highest
compaction priority even if we exhaust all retries at the lower priorities.

Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
---
 mm/page_alloc.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index bb9b4fb66e85..aa2580a1bcf9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3155,13 +3155,8 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
 	 * so it doesn't really make much sense to retry except when the
 	 * failure could be caused by insufficient priority
 	 */
-	if (compaction_failed(compact_result)) {
-		if (*compact_priority > MIN_COMPACT_PRIORITY) {
-			(*compact_priority)--;
-			return true;
-		}
-		return false;
-	}
+	if (compaction_failed(compact_result))
+		goto check_priority;
 
 	/*
 	 * make sure the compaction wasn't deferred or didn't bail out early
@@ -3185,6 +3180,15 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
 	if (compaction_retries <= max_retries)
 		return true;
 
+	/*
+	 * Make sure there is at least one attempt at the highest priority
+	 * if we exhausted all retries at the lower priorities
+	 */
+check_priority:
+	if (*compact_priority > MIN_COMPACT_PRIORITY) {
+		(*compact_priority)--;
+		return true;
+	}
 	return false;
 }
 #else
-- 
2.9.0
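
Putting the two hunks together, the decision flow of
should_compact_retry() with this patch applied reads roughly as below.
This is a reconstruction from the diff above, not a verbatim copy: the
argument list is taken from the hunk context, and the unchanged middle
of the function is only summarized in a comment.

static inline bool
should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
		     enum compact_result compact_result,
		     enum compact_priority *compact_priority,
		     int compaction_retries)
{
	/*
	 * Compaction scanned the whole zone and failed; retrying at the
	 * same priority is pointless, but a higher priority might help.
	 */
	if (compaction_failed(compact_result))
		goto check_priority;

	/*
	 * ... unchanged checks elided: compaction_withdrawn() and the
	 * compaction_retries <= max_retries test, either of which can
	 * return early and keep retrying at the current priority ...
	 */

	/*
	 * Make sure there is at least one attempt at the highest priority
	 * if we exhausted all retries at the lower priorities.
	 */
check_priority:
	if (*compact_priority > MIN_COMPACT_PRIORITY) {
		(*compact_priority)--;
		return true;
	}
	return false;
}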
