Open Source and information security mailing list archives
 
Message-ID: <2ace6fc2-6891-4d6c-98de-c027da03d516@kernel.org>
Date: Thu, 8 Jan 2026 00:08:35 +0100
From: "David Hildenbrand (Red Hat)" <david@...nel.org>
To: Kiryl Shutsemau <kas@...nel.org>, Matthew Wilcox <willy@...radead.org>
Cc: Muchun Song <muchun.song@...ux.dev>, Oscar Salvador <osalvador@...e.de>,
 Mike Rapoport <rppt@...nel.org>, Vlastimil Babka <vbabka@...e.cz>,
 Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, Zi Yan <ziy@...dia.com>,
 Baoquan He <bhe@...hat.com>, Michal Hocko <mhocko@...e.com>,
 Johannes Weiner <hannes@...xchg.org>, Jonathan Corbet <corbet@....net>,
 kernel-team@...a.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
 linux-doc@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>,
 Usama Arif <usamaarif642@...il.com>, Frank van der Linden <fvdl@...gle.com>
Subject: Re: [PATCHv2 02/14] mm/sparse: Check memmap alignment

>> "Then we make page->compound_head point to the dynamically allocated memdesc
>> rather than the first page. Then we can transition to the above layout. "
> 

Sorry for the late reply, it's been a bit crazy over here.

> I am not sure I understand how it is going to work.
> 

I don't recall all the details that Willy has shared over the last few
years while working on folios, but I will try to answer as best I can
off the top of my head. (There are plenty of resources on the list, on
the web, in his presentations, etc.)

> 32-byte layout indicates that flags will stay in the statically
> allocated part, but most (all?) flags are in the head page and we would
> need a way to redirect from tail to head in the statically allocated
> pages.

When working with folios we will never go through the head page flags. 
That's why Willy has incrementally converted most folio code that worked 
on pages to work on folios.

For example, PageUptodate() does a

	folio_test_uptodate(page_folio(page));
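
As a rough illustration of that redirection in the current layout, here is a small user-space model, not kernel code. All `model_*` names, the struct layouts and the flag bit are assumptions made up for this sketch; the only idea taken from the discussion is that a tail page carries a tagged pointer to its head page, and that the folio overlays the head page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified user-space model; these types do not match the kernel's. */
#define MODEL_PG_UPTODATE (1UL << 0)	/* hypothetical flag bit */

struct model_page {
	unsigned long flags;		/* only meaningful in the head page */
	uintptr_t compound_head;	/* tail pages: address of head page | 1 */
};

/* In this model the folio simply overlays the head page. */
struct model_folio {
	struct model_page page;
};

/* Mark @tail as a tail page of the compound page starting at @head. */
static void model_set_tail(struct model_page *tail, struct model_page *head)
{
	assert(((uintptr_t)head & 1) == 0);	/* bit 0 must be free for the tag */
	tail->compound_head = (uintptr_t)head | 1;
}

/* Resolve any page (head, tail or single page) to its folio. */
static struct model_folio *model_page_folio(struct model_page *page)
{
	uintptr_t head = page->compound_head;

	if (head & 1)				/* tail page: follow the pointer */
		return (struct model_folio *)(head - 1);
	return (struct model_folio *)page;	/* head page or single page */
}

static bool model_folio_test_uptodate(const struct model_folio *folio)
{
	return (folio->page.flags & MODEL_PG_UPTODATE) != 0;
}

/* Page-based helper: never reads the flags of the page it was handed. */
static bool model_page_uptodate(struct model_page *page)
{
	return model_folio_test_uptodate(model_page_folio(page));
}
```

In this model, asking any tail page whether it is up to date returns the state stored in the head page, which is the tail-to-head redirection Kiryl asked about.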

The flags in the 32-byte layout will be used by some non-folio things
for which we won't allocate memdescs (just yet) (e.g., free pages in the
buddy and other things that do not require a lot of metadata). Some of
these flags will be moved into the memdesc pointer in the future as the
conversion proceeds.

> 
>> The "memdesc" could be a pointer to a "struct folio" that is allocated from
>> the slab.
>>
>> So in the new memdesc world, all pages part of a folio will point at the
>> allocated "struct folio", not the head page where "struct folio" currently
>> overlays "struct page".
>>
>> That would mean that the proposal in this patch set will have to be reverted
>> again.
>>
>>
>> At LPC, Willy said that he wants to have something out there in the first
>> half of 2026.
> 
> Okay, seems ambitious to me.

When the program was called "2025" I considered it very ambitious :) Now 
I consider it ambitious. I think Willy already shared early versions of 
the "struct slab" split and the "struct ptdesc" split recently on the list.

> 
> Last time I asked, we had no idea how much performance would additional
> indirection cost us. Do we have a clue?

I raised that in the past, and I think the answer I got was that

(a) We always had these indirection costs when going from tail page to
     head page / folio.
(b) We must convert the code to do as little page_folio() as possible.
     That's why we saw so much code conversion to stop working on pages
     and only work on folios.

There are certainly cases where we cannot currently avoid the 
indirection, like when we traverse a page table and go

	pfn -> page -> folio

and cannot simply go

	pfn -> folio

On the bright side, we'll lose the head-page checks and can simply 
dereference the pointer.
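
To make the "simply dereference the pointer" case concrete, here is a minimal user-space sketch of a memdesc-style layout. It is not the planned kernel implementation: the `sketch_*` names, the fields and the allocation helper are all hypothetical, `calloc()` stands in for a slab allocation, and any type or flag bits a real memdesc pointer may carry are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PG_UPTODATE (1UL << 0)	/* hypothetical flag bit */

/* Separately allocated descriptor; it no longer overlays a page. */
struct sketch_folio {
	unsigned long flags;
	unsigned int order;
};

/* Every page of a folio stores the same pointer; there is no head/tail split. */
struct sketch_page {
	struct sketch_folio *memdesc;
};

/* Allocate a folio descriptor and point all 2^order pages at it. */
static struct sketch_folio *sketch_folio_alloc(struct sketch_page *pages,
					       unsigned int order)
{
	struct sketch_folio *folio;

	assert(pages != NULL);
	folio = calloc(1, sizeof(*folio));
	if (!folio)
		return NULL;
	folio->order = order;
	for (size_t i = 0; i < ((size_t)1 << order); i++)
		pages[i].memdesc = folio;
	return folio;
}

/* The lookup is a plain load: no "is this a tail page?" test is needed. */
static struct sketch_folio *sketch_page_folio(const struct sketch_page *page)
{
	return page->memdesc;
}

/* Detach all pages from the descriptor and release it. */
static void sketch_folio_free(struct sketch_page *pages,
			      struct sketch_folio *folio)
{
	for (size_t i = 0; i < ((size_t)1 << folio->order); i++)
		pages[i].memdesc = NULL;
	free(folio);
}
```

Compared with a tagged head pointer, the lookup costs the same single load from the page, but the branch on the tag bit disappears and the first page is treated like every other page.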

I don't know whether Willy has more information yet, but I would assume 
that in most cases this will be similar to the performance summary in 
your cover letter: "... has shown either no change or only a slight 
improvement within the noise.", just that it will be "only a slight 
degradation within the noise". :)

We'll learn I guess, in particular which other page -> folio conversions 
cannot be optimized out by caching the folio.
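
What "caching the folio" buys can be sketched with a toy model that counts lookups. Everything here is hypothetical (the `demo_*` names, the `dirty_pages` counter, the global lookup counter); it only contrasts resolving the folio once per page with resolving it once per batch of pages that are known to share a folio.

```c
#include <assert.h>
#include <stddef.h>

struct demo_folio {
	unsigned long dirty_pages;	/* hypothetical per-folio counter */
};

struct demo_page {
	struct demo_folio *folio;
};

/* Counts how often the page -> folio indirection is taken. */
static unsigned long demo_lookups;

static struct demo_folio *demo_page_folio(struct demo_page *page)
{
	demo_lookups++;
	return page->folio;
}

/* Page-based style: one page -> folio lookup for every page touched. */
static void demo_mark_dirty_per_page(struct demo_page *pages, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		demo_page_folio(&pages[i])->dirty_pages++;
}

/*
 * Folio-based style: resolve the folio once and keep working on it.
 * Assumes all @nr pages belong to the same folio.
 */
static void demo_mark_dirty_cached(struct demo_page *pages, size_t nr)
{
	struct demo_folio *folio;

	if (nr == 0)
		return;
	folio = demo_page_folio(&pages[0]);
	for (size_t i = 1; i < nr; i++)
		assert(pages[i].folio == folio);	/* check the assumption */
	folio->dirty_pages += nr;
}
```

Both functions leave the folio in the same state; the difference is only in how many times the indirection is taken, which is the cost the conversions from pages to folios try to remove.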


For quite some time there will be a magical config option that will 
switch between both layouts. I'd assume that things will get more 
complicated if we suddenly have a "compound_head/folio" pointer and a 
"compound_info" pointer at the same time.

But it's really Willy who has the concept in mind, and he is very likely
busy writing some of that code right now.

I'm just the messenger.

:)

[I would hope that Willy could share his thoughts]

-- 
Cheers

David
