linux-kernel - Re: [PATCH] mm, compaction: direct freepage allocation for async direct compaction

Message-Id: <20171128153416.f7062caba47d86eb4eb15b8b@linux-foundation.org>
Date:   Tue, 28 Nov 2017 15:34:16 -0800
From:   Andrew Morton <akpm@...ux-foundation.org>
To:     Vlastimil Babka <vbabka@...e.cz>
Cc:     Johannes Weiner <hannes@...xchg.org>, Mel Gorman <mgorman@...e.de>,
        Joonsoo Kim <iamjoonsoo.kim@....com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, kernel-team@...com
Subject: Re: [PATCH] mm, compaction: direct freepage allocation for async
 direct compaction

On Wed, 22 Nov 2017 15:52:55 +0100 Vlastimil Babka <vbabka@...e.cz> wrote:

> On 11/22/2017 03:33 PM, Johannes Weiner wrote:
> > From: Vlastimil Babka <vbabka@...e.cz>
> > 
> > The goal of direct compaction is to quickly make a high-order page available
> > for the pending allocation. The free page scanner can add significant latency
> > when searching for migration targets, although for compaction to succeed, the
> > only important limit on the target free pages is that they must not come from
> > the same order-aligned block as the migrated pages.
> > 
> > This patch therefore makes direct async compaction allocate freepages directly
> > from freelists. Pages that do come from the same block (which we cannot simply
> > exclude from the freelist allocation) are put on a separate list and released
> > only after migration to allow them to merge.
> > 
> > In addition to reduced stall, another advantage is that we split larger free
> > pages for migration targets only when smaller pages are depleted, while the
> > free scanner can split pages up to (order - 1) as it encouters them. However,
> > this approach likely sacrifices some of the long-term anti-fragmentation
> > features of a thorough compaction, so we limit the direct allocation approach
> > to direct async compaction.
> > 
> > For observational purposes, the patch introduces two new counters to
> > /proc/vmstat. compact_free_direct_alloc counts how many pages were allocated
> > directly without scanning, and compact_free_direct_miss counts the subset of
> > these allocations that were from the wrong range and had to be held on the
> > separate list.
> > 
> > Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
> > Signed-off-by: Johannes Weiner <hannes@...xchg.org>
> > ---
> > 
> > Hi. I'm resending this because we've been struggling with the cost of
> > compaction in our fleet, and this patch helps substantially.
> > 
> > On 128G+ machines, we have seen isolate_freepages_block() eat up 40%
> > of the CPU cycles and scan up to a billion PFNs per minute. Not in
> > a spike, but continuously, to service higher-order allocations from
> > the network stack, fork (non-vmap stacks), THP, etc. during regular
> > operation.
> > 
> > I've been running this patch on a handful of less-affected but still
> > pretty bad machines for a week, and the results look pretty great:
> > 
> > 	http://cmpxchg.org/compactdirectalloc/compactdirectalloc.png
> 
> Thanks a lot, that's very encouraging!

Yup.

Should we proceed with this patch for now, or wait for something better
to come along?
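
For anyone evaluating the patch on their own machines, the two counters
described in the quoted changelog could be watched with something like the
sketch below. It assumes the patch is applied and that
compact_free_direct_alloc and compact_free_direct_miss appear in
/proc/vmstat as plain "name value" lines like the existing counters. The
helper names parse_vmstat and direct_alloc_miss_ratio are made up for
illustration and are not part of the patch.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly "<name> <non-negative integer>".
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def direct_alloc_miss_ratio(counters):
    """Fraction of directly allocated pages that came from the wrong range.

    Returns None when the counters are missing (assumed to mean a kernel
    without the patch) or when no direct allocations have been counted yet.
    """
    alloc = counters.get("compact_free_direct_alloc")
    miss = counters.get("compact_free_direct_miss")
    if not alloc or miss is None:
        return None
    return miss / alloc


if __name__ == "__main__":
    try:
        with open("/proc/vmstat") as f:
            counters = parse_vmstat(f.read())
    except OSError:
        counters = {}  # not on Linux, or /proc is unavailable

    ratio = direct_alloc_miss_ratio(counters)
    if ratio is None:
        print("direct-allocation counters not available")
    else:
        print(f"direct alloc miss ratio: {ratio:.2%}")
```

A high ratio would suggest that many directly allocated pages had to be
held on the separate list until migration finished. Since the counters are
cumulative, sampling them twice and comparing the deltas is likely more
useful than a single reading.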