Message-ID: <4f82b8ef-77de-422b-a9a5-691c4eca24a3@kernel.org>
Date: Tue, 23 Dec 2025 10:38:26 +0100
From: "David Hildenbrand (Red Hat)" <david@...nel.org>
To: Muchun Song <muchun.song@...ux.dev>, Matthew Wilcox <willy@...radead.org>
Cc: Kiryl Shutsemau <kas@...nel.org>, Oscar Salvador <osalvador@...e.de>,
Mike Rapoport <rppt@...nel.org>, Vlastimil Babka <vbabka@...e.cz>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, Zi Yan <ziy@...dia.com>,
Baoquan He <bhe@...hat.com>, Michal Hocko <mhocko@...e.com>,
Johannes Weiner <hannes@...xchg.org>, Jonathan Corbet <corbet@....net>,
kernel-team@...a.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>,
Usama Arif <usamaarif642@...il.com>, Frank van der Linden <fvdl@...gle.com>
Subject: Re: [PATCHv2 02/14] mm/sparse: Check memmap alignment
On 12/22/25 15:55, Muchun Song wrote:
>
>
>> On Dec 22, 2025, at 22:18, David Hildenbrand (Red Hat) <david@...nel.org> wrote:
>>
>> On 12/22/25 15:02, Kiryl Shutsemau wrote:
>>> On Mon, Dec 22, 2025 at 04:34:40PM +0800, Muchun Song wrote:
>>>>
>>>>
>>>> On 2025/12/18 23:09, Kiryl Shutsemau wrote:
>>>>> The upcoming changes in compound_head() require memmap to be naturally
>>>>> aligned to the maximum folio size.
>>>>>
>>>>> Add a warning if it is not.
>>>>>
>>>>> A warning is sufficient as MAX_FOLIO_ORDER is very rarely used, so the
>>>>> kernel is still likely to be functional if this strict check fails.
>>>>
>>>> Different architectures default to 2 MB alignment (mainly to
>>>> enable huge mappings), which only accommodates folios up to
>>>> 128 MB. Yet 1 GB huge pages are still fairly common, so
>>>> validating 16 GB (MAX_FOLIO_SIZE) alignment seems likely to
>>>> miss the most frequent case.
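
(Back-of-the-envelope behind the numbers above, assuming 4 KiB base
pages and a 64-byte struct page, both common defaults:

    2 MiB of memmap = 2 MiB / 64 B =  32768 struct pages -> covers  128 MiB of memory
    1 GiB folio     =  262144 pages -> needs  16 MiB of naturally aligned memmap
    16 GiB folio    = 4194304 pages -> needs 256 MiB of naturally aligned memmap

so a 2 MiB-aligned memmap only guarantees natural alignment for folios
up to 128 MiB.)
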
>>> I don't follow. The 16 GB check is stricter than anything smaller.
>>> How can it miss the most frequent case?
>>>> I’m concerned that this might plant a hidden time bomb: it
>>>> could detonate at any moment in later code, silently triggering
>>>> memory corruption or similar failures. Therefore, I don’t
>>>> think a WARNING is a good choice.
>>> We can upgrade it to BUG_ON(), but I want to understand your logic here
>>> first.
>>
>> Definitely no BUG_ON(). I would assume this is something we would find early during testing, so even a VM_WARN_ON_ONCE() should be good enough?
>>
>> This smells like a possible problem, though, as soon as some architecture wants to increase the folio size. What would be the expected steps to ensure the alignment is done properly?
>>
>> But OTOH, as I raised, Willy's work will make all of this obsolete either way, so maybe it's not worth worrying about that case too much.
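
(Aside, to make the discussion above concrete: a minimal sketch of the
kind of check being debated, not the actual patch. The helper name and
where it would be called from are made up here; MAX_FOLIO_NR_PAGES,
IS_ALIGNED() and VM_WARN_ON_ONCE() are existing kernel helpers.

#include <linux/align.h>
#include <linux/mm.h>
#include <linux/mmdebug.h>

/*
 * Hypothetical helper: complain (once) if the memmap backing a section
 * is not naturally aligned to the memmap span of a MAX_FOLIO_ORDER folio.
 */
static void check_memmap_alignment(struct page *memmap)
{
	/* Bytes of memmap that describe one maximally sized folio. */
	unsigned long span = MAX_FOLIO_NR_PAGES * sizeof(struct page);

	VM_WARN_ON_ONCE(!IS_ALIGNED((unsigned long)memmap, span));
}

Swapping VM_WARN_ON_ONCE() for BUG_ON() is the severity question being
discussed above.)
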
>
> Hi David,
>
Hi! :)
> I hope you're doing well. I must admit I have limited knowledge of Willy's work, and I was wondering if you might be kind enough to share any publicly available links where I could learn more about the future direction of this project. I would be truly grateful for your guidance.
> Thank you very much in advance.
There is some information to be had at [1], but more at [2]. Take a
look at the part of [2] headed "After those projects are complete -
Then we can shrink struct page to 32 bytes:"
In essence, every page belonging to a memdesc will have a "memdesc"
pointer (which replaces the compound_head pointer).
"Then we make page->compound_head point to the dynamically allocated
memdesc rather than the first page. Then we can transition to the above
layout. "
The "memdesc" could be a pointer to a "struct folio" that is allocated
from the slab.
So in the new memdesc world, all pages that are part of a folio will
point at the allocated "struct folio", not at the head page where
"struct folio" currently overlays "struct page".
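
To make that concrete, roughly (simplified; the second helper and the
page->memdesc field are hypothetical, only meant to illustrate the
direction):

#include <linux/compiler.h>	/* READ_ONCE() */
#include <linux/mm_types.h>	/* struct page, struct folio */

/*
 * Today (simplified from _compound_head() in include/linux/page-flags.h):
 * tail pages store the head page's address with bit 0 set.
 */
static inline struct page *compound_head_today(const struct page *page)
{
	unsigned long head = READ_ONCE(page->compound_head);

	if (head & 1)
		return (struct page *)(head - 1);
	return (struct page *)page;
}

/*
 * Memdesc world (hypothetical sketch): every page of a folio points at
 * the slab-allocated struct folio instead of at the head page.
 */
static inline struct folio *page_folio_memdesc(const struct page *page)
{
	return (struct folio *)READ_ONCE(page->memdesc);
}
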
That would mean that the proposal in this patch set would have to be
reverted again.
At LPC, Willy said that he wants to have something out there in the
first half of 2026.
[1] https://kernelnewbies.org/MatthewWilcox/Memdescs
[2] https://kernelnewbies.org/MatthewWilcox/Memdescs/Path
--
Cheers
David