Message-ID: <2ace6fc2-6891-4d6c-98de-c027da03d516@kernel.org>
Date: Thu, 8 Jan 2026 00:08:35 +0100
From: "David Hildenbrand (Red Hat)" <david@...nel.org>
To: Kiryl Shutsemau <kas@...nel.org>, Matthew Wilcox <willy@...radead.org>
Cc: Muchun Song <muchun.song@...ux.dev>, Oscar Salvador <osalvador@...e.de>,
Mike Rapoport <rppt@...nel.org>, Vlastimil Babka <vbabka@...e.cz>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, Zi Yan <ziy@...dia.com>,
Baoquan He <bhe@...hat.com>, Michal Hocko <mhocko@...e.com>,
Johannes Weiner <hannes@...xchg.org>, Jonathan Corbet <corbet@....net>,
kernel-team@...a.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>,
Usama Arif <usamaarif642@...il.com>, Frank van der Linden <fvdl@...gle.com>
Subject: Re: [PATCHv2 02/14] mm/sparse: Check memmap alignment
>> "Then we make page->compound_head point to the dynamically allocated memdesc
>> rather than the first page. Then we can transition to the above layout. "
>
> I am not sure I understand how it is going to work.
>
Sorry for the late reply, it's been a bit crazy over here.
I don't recall all the details that Willy shared over the last years
while working on folios, but I will try to answer as best as I can off
the top of my head. (There are plenty of resources on the list, on the
web, in his presentations etc.)
> 32-byte layout indicates that flags will stay in the statically
> allocated part, but most (all?) flags are in the head page and we would
> need a way to redirect from tail to head in the statically allocated
> pages.
When working with folios we will never go through the head page flags.
That's why Willy has incrementally converted most folio code that worked
on pages to work on folios.
For example, PageUptodate() does a

	folio_test_uptodate(page_folio(page));

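To make the redirect concrete, here is a minimal userspace sketch of a page-based flag test that always goes through the folio. It is a model, not kernel code: every model_* name is hypothetical, and the "bit 0 set marks a tail page" encoding is only my understanding of how compound_head works today.

```c
/* Userspace model only; the model_* names are hypothetical, not kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PG_UPTODATE (1UL << 0)

struct model_page {
	unsigned long flags;      /* in this model only the head's flags are used */
	uintptr_t compound_head;  /* tail pages: address of the head with bit 0 set */
};

/* Mark 'tail' as belonging to the compound page that starts at 'head'. */
static void model_link_tail(struct model_page *tail, struct model_page *head)
{
	assert(((uintptr_t)head & 1) == 0);  /* bit 0 must be free for the tag */
	tail->compound_head = (uintptr_t)head | 1;
}

/* Resolve any page to the page that carries the folio state. */
static struct model_page *model_page_folio(struct model_page *page)
{
	uintptr_t head = page->compound_head;

	if (head & 1)
		return (struct model_page *)(head - 1);
	return page;
}

/* Page-based test that never reads the tail page's own flags. */
static bool model_page_uptodate(struct model_page *page)
{
	return (model_page_folio(page)->flags & MODEL_PG_UPTODATE) != 0;
}
```

Calling model_page_uptodate() on any tail page gives the same answer as calling it on the head, which is why the flags in the tail pages' static part are never consulted on this path.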
The flags in the 32-byte layout will be used by some non-folio things
for which we won't allocate memdescs (just yet) (e.g., free pages in the
buddy and other things that do not require a lot of metadata). Some of
these flags will be moved into the memdesc pointer in the future as the
conversion proceeds.
>
>> The "memdesc" could be a pointer to a "struct folio" that is allocated from
>> the slab.
>>
>> So in the new memdesc world, all pages part of a folio will point at the
>> allocated "struct folio", not the head page where "struct folio" currently
>> overlays "struct page".
>>
>> That would mean that the proposal in this patch set will have to be reverted
>> again.
>>
>>
>> At LPC, Willy said that he wants to have something out there in the first
>> half of 2026.
>
> Okay, seems ambitious to me.
When the program was called "2025" I considered it very ambitious :) Now
I consider it ambitious. I think Willy already shared early versions of
the "struct slab" split and the "struct ptdesc" split recently on the list.
>
> Last time I asked, we had no idea how much performance would additional
> indirection cost us. Do we have a clue?
I raised that in the past, and I think the answer I got was that
(a) We always had this indirection cost when going from tail page to
head page / folio.
(b) We must convert the code to do as little page_folio() as possible.
That's why we saw so much code conversion to stop working on pages
and only work on folios.
There are certainly cases where we cannot currently avoid the
indirection, like when we traverse a page table and go

	pfn -> page -> folio

and cannot simply go

	pfn -> folio

On the bright side, we'll lose the head-page checks and can simply
dereference the pointer.
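A rough userspace sketch of that pfn -> page -> folio walk in a memdesc-style layout follows. It assumes every page of a folio stores the same pointer to a separately allocated descriptor; the model_* names are hypothetical, and whatever the real memdesc pointer encodes beyond the plain address is not modeled.

```c
/* Userspace model only; the model_* names are hypothetical, not kernel API. */
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_PAGES 8

struct model_folio {
	unsigned long flags;
	unsigned int order;           /* the folio spans 1 << order pages */
};

struct model_page {
	struct model_folio *memdesc;  /* same pointer in every page of the folio */
};

/* Stand-in for the memmap; zero-initialized, so pages start without a memdesc. */
static struct model_page model_memmap[MODEL_NR_PAGES];

static struct model_page *model_pfn_to_page(unsigned long pfn)
{
	assert(pfn < MODEL_NR_PAGES);
	return &model_memmap[pfn];
}

/* No head/tail distinction: one load, no tag check. */
static struct model_folio *model_page_folio(const struct model_page *page)
{
	return page->memdesc;
}

/* Point every page covered by 'folio' at the descriptor. */
static void model_folio_attach(struct model_folio *folio, unsigned long first_pfn)
{
	unsigned long i, nr = 1UL << folio->order;

	assert(first_pfn + nr <= MODEL_NR_PAGES);
	for (i = 0; i < nr; i++)
		model_memmap[first_pfn + i].memdesc = folio;
}
```

The indirection is still there (two dependent loads from a pfn), but model_page_folio() has no branch on whether the page is a head or a tail.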
I don't know whether Willy has more information yet, but I would assume
that in most cases this will be similar to the performance summary in
your cover letter: "... has shown either no change or only a slight
improvement within the noise.", just that it will be "only a slight
degradation within the noise". :)
We'll learn I guess, in particular which other page -> folio conversions
cannot be optimized out by caching the folio.
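As an illustration of what "caching the folio" can mean, the sketch below counts pages that belong to dirty folios in two ways: one lookup per page, and one lookup per folio. It is a userspace model with hypothetical model_* names, and it assumes the range starts on a folio boundary and that folios are laid out back to back.

```c
/* Userspace model only; the model_* names are hypothetical, not kernel API. */
#include <assert.h>
#include <stddef.h>

struct model_folio {
	size_t nr_pages;
	int dirty;
};

struct model_page {
	struct model_folio *memdesc;
};

/* Counts page -> folio lookups so the two variants can be compared. */
static unsigned long model_lookups;

static struct model_folio *model_page_folio(const struct model_page *page)
{
	model_lookups++;
	return page->memdesc;
}

/* One lookup for every page in the range. */
static unsigned long model_count_dirty_naive(const struct model_page *pages, size_t n)
{
	unsigned long count = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (model_page_folio(&pages[i])->dirty)
			count++;
	return count;
}

/* One lookup per folio: resolve once, then step over the whole folio. */
static unsigned long model_count_dirty_cached(const struct model_page *pages, size_t n)
{
	unsigned long count = 0;
	size_t i = 0;

	while (i < n) {
		const struct model_folio *folio = model_page_folio(&pages[i]);
		size_t nr = folio->nr_pages;

		assert(nr >= 1);      /* guarantees forward progress */
		if (nr > n - i)
			nr = n - i;   /* clamp a folio that extends past the range */
		if (folio->dirty)
			count += nr;
		i += nr;
	}
	return count;
}
```

Both functions return the same count; the cached variant simply performs one lookup per folio instead of one per page. Callers that only ever see isolated pages cannot apply this trick, which is the kind of case the paragraph above refers to.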
For quite some time there will be a magical config option that will
switch between both layouts. I'd assume that things will get more
complicated if we suddenly have a "compound_head/folio" pointer and a
"compound_info" pointer at the same time.
But it's really Willy who has the concept in mind, and he is very likely
busy writing some of that code right now.
I'm just the messenger.
:)
[I would hope that Willy could share his thoughts]
--
Cheers
David