Message-ID: <0cfb2279-8cfa-6be9-d732-1fc6b1fb6dc1@suse.cz>
Date:   Thu, 7 Oct 2021 12:06:08 +0200
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Kent Overstreet <kent.overstreet@...il.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Cc:     hannes@...xchg.org, willy@...radead.org, rientjes@...gle.com,
        Mel Gorman <mgorman@...hsingularity.net>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: Compaction & folios

On 10/7/21 00:53, Kent Overstreet wrote:
> So I have some observations on memory compaction & hugepages.
> 
> Right now, the working assumption in MM is that compaction is hard and
> expensive, and right now it is - because most allocations are order 0, with a
> small subset being hugepage order allocations. This means any time we need a
> hugepage, compaction has to move a bunch of order 0 pages around, and memory
> reclaim is no help here - when we reclaim memory, it's coming back as fragmented
> order 0 pages.
> 
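For scale (assuming x86-64 with 4 KiB base pages): a 2 MiB hugepage is an
order-9 allocation, i.e. 512 contiguous base pages, so satisfying a single
hugepage request can mean migrating up to a few hundred scattered order-0
pages.
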
> But what if compaction wasn't such a difficult, expensive operation?
> 
> With folios, and then folios for anonymous pages, we won't see nearly so many
> order 0 allocations anymore - we'll see a spread of allocation sizes based on a
> mixture of application usage patterns - something much closer to a Poisson
> distribution, vs. our current very bimodal distribution. And since we won't be
> fragmenting all our allocations up front, memory reclaim will be freeing
> allocations in this same distribution.

Unfortunately, the main problem with compaction is not the act of moving a
number of LRU pages, but rather the presence of unmovable pages (slab, page
tables and other kernel allocations), where a single such page makes the whole
2MB block unusable. So I don't expect this would help dramatically for
compaction, but the points added by Matthew would still apply.
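
As a back-of-envelope illustration (assuming 4 KiB pages and 2 MiB pageblocks,
i.e. 512 pages per block): if even 0.2% of pages were unmovable and scattered
uniformly at random, the chance that a given pageblock contains at least one
of them would be 1 - 0.998^512, roughly 64%. Pageblock grouping by migratetype
exists precisely to keep unmovable allocations clustered rather than scattered,
but any that fall back into movable pageblocks will block compaction there
until they are freed.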

> Which means that any time an order n allocation fails, it's likely that we'll
> still have order n-1 pages free - and of those free order n-1 pages, one will
> likely have a buddy that's movable and hasn't been fragmented - meaning the
> common case is that compaction will have to move _one_ (higher order) page -
> we'll almost never have to move a bunch of 4k pages.
> 
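To make the buddy relation concrete, here is a rough userspace sketch (not the
kernel's actual code; the pfn and order values are made up for illustration):
the buddy of an order-n block at page frame number pfn is found by flipping
bit n of the pfn, so once a free order-(n-1) block's single buddy has its
movable contents migrated away, the pair merges into a free order-n block.

#include <stdio.h>

/* Buddy of the order-'order' block starting at 'pfn': flip bit 'order'. */
static unsigned long buddy_pfn(unsigned long pfn, unsigned int order)
{
        return pfn ^ (1UL << order);
}

int main(void)
{
        unsigned long free_pfn = 0x200; /* hypothetical free order-8 block */
        unsigned int order = 8;

        printf("free order-%u block at pfn 0x%lx, its buddy starts at pfn 0x%lx\n",
               order, free_pfn, buddy_pfn(free_pfn, order));
        printf("migrate that one buddy and the pair merges into an order-%u block\n",
               order + 1);
        return 0;
}
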
> Another way of thinking of this is that memory reclaim will be doing most of the
> work that compaction has to do now to allocate a high order page. Compaction
> will go from an expensive, somewhat unreliable operation to one that mostly just
> works - it's going to be _much_ less of a pain point.
> 
> It may turn out that allocating hugepages still doesn't work as reliably as we'd
> like - but folios are still a big help even when we can't allocate a 2MB page,
> because we'll be able to fall back to an order 6 or 7 or 8 allocation, which is
> something we can't do now. And, since multiple CPU vendors now support
> coalescing contiguous PTE entries in the TLB, this will still get us most of the
> performance benefits of using hugepages.
> 
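For reference, with 4 KiB base pages the fallback orders mentioned above work
out to:

  order 6 =  64 pages = 256 KiB
  order 7 = 128 pages = 512 KiB
  order 8 = 256 pages =   1 MiB
  order 9 = 512 pages =   2 MiB (PMD-sized hugepage on x86-64)

and e.g. arm64's contiguous-PTE hint lets a run of 16 adjacent 4 KiB entries
(64 KiB) share a single TLB entry, so even the sub-2MiB fallbacks can recover
much of the TLB benefit.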
