linux-kernel - Re: [GIT PULL] Memory folios for v5.15

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YSi4bZ7myEMNBtlY@cmpxchg.org>
Date:   Fri, 27 Aug 2021 06:03:25 -0400
From:   Johannes Weiner <hannes@...xchg.org>
To:     David Howells <dhowells@...hat.com>
Cc:     Matthew Wilcox <willy@...radead.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [GIT PULL] Memory folios for v5.15

On Thu, Aug 26, 2021 at 09:58:06AM +0100, David Howells wrote:
> One thing I like about Willy's folio concept is that, as long as everyone uses
> the proper accessor functions and macros, we can mostly ignore the fact that
> they're 2^N sized/aligned and they're composed of exact multiples of pages.
> What really matters are the correspondences between folio size/alignment and
> medium/IO size/alignment, so you could look on the folio as being a tool to
> disconnect the filesystem from the concept of pages.
>
> We could, in the future, in theory, allow the internal implementation of a
> folio to shift from being a page array to being a kmalloc'd page list or
> allow higher order units to be mixed in.  The main thing we have to stop
> people from doing is directly accessing the members of the struct.

In the current state of the folio patches, I agree with you. But
conceptually, folios are not disconnecting from the page beyond
PAGE_SIZE -> PAGE_SIZE * (1 << folio_order()). This is why I asked
what the intended endgame is. And I wonder if there is a bit of an
alignment issue between FS and MM people about the exact nature and
identity of this data structure.

At the current stage of conversion, folio is a more clearly delineated
API of what can be safely used from the FS for the interaction with
the page cache and memory management. And it looks still flexible to
make all sorts of changes, including how it's backed by
memory. Compared with the page, where parts of the API are for the FS,
but there are tons of members, functions, constants, and restrictions
due to the page's role inside MM core code. Things you shouldn't be
using, things you shouldn't be assuming from the fs side, but it's
hard to tell which is which, because struct page is a lot of things.

However, the MM narrative for folios is that they're an abstraction
for regular vs compound pages. This is rather generic. Conceptually,
it applies very broadly and deeply to MM core code: anonymous memory
handling, reclaim, swapping, even the slab allocator uses them. If we
follow through on this concept from the MM side - and that seems to be
the plan - it's inevitable that the folio API will grow more
MM-internal members, methods, as well as restrictions again in the
process. Except for the tail page bits, I don't see too much in struct
page that would not conceptually fit into this version of the folio.

The cache_entry idea is really just to codify and retain that
domain-specific minimalism and clarity from the filesystem side. As
well as the flexibility around how backing memory is implemented,
which I think could come in handy soon, but isn't the sole reason.