linux-kernel - Re: [PATCH 09/11] mm, compaction: Abstract compaction feedback to helpers

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20160411151410.GL23157@dhcp22.suse.cz>
Date:	Mon, 11 Apr 2016 17:14:10 +0200
From:	Michal Hocko <mhocko@...nel.org>
To:	Vlastimil Babka <vbabka@...e.cz>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Johannes Weiner <hannes@...xchg.org>,
	Mel Gorman <mgorman@...e.de>,
	David Rientjes <rientjes@...gle.com>,
	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
	Joonsoo Kim <js1304@...il.com>,
	Hillf Danton <hillf.zj@...baba-inc.com>, linux-mm@...ck.org,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 09/11] mm, compaction: Abstract compaction feedback to
 helpers

On Mon 11-04-16 16:39:21, Vlastimil Babka wrote:
> On 04/05/2016 01:25 PM, Michal Hocko wrote:
[...]
> >+/* Compaction has failed and it doesn't make much sense to keep retrying. */
> >+static inline bool compaction_failed(enum compact_result result)
> >+{
> >+	/* All zones where scanned completely and still not result. */
> 
> Hmm given that try_to_compact_pages() uses a max() on results, then in fact
> it takes only one zone to get this. Others could have been also SKIPPED or
> DEFERRED. Is that what you want?

In short I didn't find any better way and still guarantee a some
guarantee of convergence. COMPACT_COMPLETE means that at least one zone
was completely scanned and led to no result. That zone would be
compact_suitable by definition. If I made DEFERRED or SKIPPED more
priorite (aka higher in the enum) then I could easily end up in a state
where all zones would return COMPACT_COMPLETE and few remaining would
just alternate returning their DEFFERED resp. SKIPPED. So while this
might sound like giving up too early I couldn't come up with anything
more specific that would lead to reliable results.

I am open to any suggestions of course.

[...]
> >--- a/mm/page_alloc.c
> >+++ b/mm/page_alloc.c
> >@@ -3362,25 +3362,12 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> >  	if (page)
> >  		goto got_pg;
> >
> >-	/* Checks for THP-specific high-order allocations */
> >-	if (is_thp_gfp_mask(gfp_mask)) {
> >-		/*
> >-		 * If compaction is deferred for high-order allocations, it is
> >-		 * because sync compaction recently failed. If this is the case
> >-		 * and the caller requested a THP allocation, we do not want
> >-		 * to heavily disrupt the system, so we fail the allocation
> >-		 * instead of entering direct reclaim.
> >-		 */
> >-		if (compact_result == COMPACT_DEFERRED)
> >-			goto nopage;
> >-
> >-		/*
> >-		 * Compaction is contended so rather back off than cause
> >-		 * excessive stalls.
> >-		 */
> >-		if(compact_result == COMPACT_CONTENDED)
> >-			goto nopage;
> >-	}
> >+	/*
> >+	 * Checks for THP-specific high-order allocations and back off
> >+	 * if the the compaction backed off
> >+	 */
> >+	if (is_thp_gfp_mask(gfp_mask) && compaction_withdrawn(compact_result))
> >+		goto nopage;
> 
> The change of semantics for THP is not trivial here and should at least be
> discussed in changelog. CONTENDED and DEFERRED is only subset of
> compaction_withdrawn() as seen above.

True. My main motivation was to get rid of the compaction specific code
from the allocator path as much as possible. I can drop the above hunk
of course but I think we should get rid of these checks and make the
code simpler. To be honest I am not even sure those changes are really
measurable.

> Why is it useful to back off due to
> COMPACT_PARTIAL_SKIPPED (we were just unlucky in our starting position), but
> not due to COMPACT_COMPLETE (we have seen the whole zone but failed anyway)?

OK, that is a good remark. I could change that to:
	if (is_thp_gfp_mask(gfp_mask) &&
		(compaction_withdrawn(compact_result) || compaction_failed(compact_result))

> Why back off due to COMPACT_SKIPPED (not enough order-0 pages) without
> trying reclaim at least once, and then another async compaction, like
> before?

The idea was that COMPACT_SKIPPED wouldn't change after a single reclaim
round most of the time because a zone would have to get above
low_wmark + 1<<9 pages.  So the only situation where it would matter would be if
we had some order-9 pages available hidden by the min wmark and we would
reclaim enough to get above the above gap. I am not sure this is what we
really want in the first place. Increase the reclaim stalls when we are
getting under memory pressure.

Thanks!
-- 
Michal Hocko
SUSE Labs