lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 17 Feb 2017 12:39:40 -0500
From:   Johannes Weiner <hannes@...xchg.org>
To:     Vlastimil Babka <vbabka@...e.cz>
Cc:     linux-mm@...ck.org, Joonsoo Kim <iamjoonsoo.kim@....com>,
        David Rientjes <rientjes@...gle.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        linux-kernel@...r.kernel.org, kernel-team@...com
Subject: Re: [PATCH v2 07/10] mm, compaction: restrict async compaction to
 pageblocks of same migratetype

On Fri, Feb 17, 2017 at 05:32:00PM +0100, Vlastimil Babka wrote:
> On 02/14/2017 09:10 PM, Johannes Weiner wrote:
> > On Fri, Feb 10, 2017 at 06:23:40PM +0100, Vlastimil Babka wrote:
> >> The migrate scanner in async compaction is currently limited to MIGRATE_MOVABLE
> >> pageblocks. This is a heuristic intended to reduce latency, based on the
> >> assumption that non-MOVABLE pageblocks are unlikely to contain movable pages.
> >> 
> >> However, with the exception of THP's, most high-order allocations are not
> >> movable. Should the async compaction succeed, this increases the chance that
> >> the non-MOVABLE allocations will fallback to a MOVABLE pageblock, making the
> >> long-term fragmentation worse.
> >> 
> >> This patch attempts to help the situation by changing async direct compaction
> >> so that the migrate scanner only scans the pageblocks of the requested
> >> migratetype. If it's a non-MOVABLE type and there are such pageblocks that do
> >> contain movable pages, chances are that the allocation can succeed within one
> >> of such pageblocks, removing the need for a fallback. If that fails, the
> >> subsequent sync attempt will ignore this restriction.
> >> 
> >> Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
> > 
> > Yes, IMO we should make the async compaction scanner decontaminate
> > unmovable blocks. This is because we fall back to other-typed blocks
> > before we reclaim,
> 
> Which we could change too, patch 9 is a step in that direction.

Yep, patch 9 looks good to me too, pending data that confirms it.

> > so any unmovable blocks that aren't perfectly
> > occupied will fill with greedy page cache (and order-0 doesn't steal
> > blocks back to make them compactable again).
> 
> order-0 allocation can actually steal the block back, the decisions to steal are
> based on the order of the free pages in the fallback block, not on the
> allocation order. But maybe I'm not sure what exactly you meant here.

No, that was me misreading the code. Scratch what's in parentheses.

> > The thing I'm not entirely certain about is the aggressiveness of this
> > patch. Instead of restricting the async scanner to blocks of the same
> > migratetype, wouldn't it be better (in terms of allocation latency) to
> > simply let it compact *all* block types?
> 
> Yes it would help allocation latency, but I'm afraid it will remove most of the
> decontamination effect.
> 
> > Maybe changing it to look at
> > unmovable blocks is enough to curb cross-contamination. Sure there
> > will still be some, but now we're matching the decontamination rate to
> > the rate of !movable higher-order allocations and don't just rely on
> > the independent cache turnover rate, which during higher-order bursts
> > might not be high enough to prevent an expansion of unmovable blocks.
> 
> The rate of compaction attempts is matched with allocations, but the probability
> of compaction scanner being in unmovable block is low when the majority of
> blocks are movable. So the decontamination rate is proportional but much smaller.

Yeah, you're right. The unmovable blocks would still expand, we'd just
turn it into a logarithmic curve.

> > Does that make sense?
> 
> I guess I can try and look at the stats, but I have doubts.

I don't insist. Your patch is implementing a good thing, we can just
keep an eye out for a change in allocation latencies before spending
time trying to mitigate a potential non-issue.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ