Message-ID: <aTCZEcJqcgGv8Zir@gourry-fedora-PF4VCD3F>
Date: Wed, 3 Dec 2025 15:09:53 -0500
From: Gregory Price <gourry@...rry.net>
To: "David Hildenbrand (Red Hat)" <david@...nel.org>
Cc: Frank van der Linden <fvdl@...gle.com>,
	Johannes Weiner <hannes@...xchg.org>, linux-mm@...ck.org,
	kernel-team@...a.com, linux-kernel@...r.kernel.org,
	akpm@...ux-foundation.org, vbabka@...e.cz, surenb@...gle.com,
	mhocko@...e.com, jackmanb@...gle.com, ziy@...dia.com,
	kas@...nel.org, dave.hansen@...ux.intel.com,
	rick.p.edgecombe@...el.com, muchun.song@...ux.dev,
	osalvador@...e.de, x86@...nel.org, linux-coco@...ts.linux.dev,
	kvm@...r.kernel.org, Wei Yang <richard.weiyang@...il.com>,
	David Rientjes <rientjes@...gle.com>,
	Joshua Hahn <joshua.hahnjy@...il.com>
Subject: Re: [PATCH v4] page_alloc: allow migration of smaller hugepages
 during contig_alloc

On Wed, Dec 03, 2025 at 08:43:29PM +0100, David Hildenbrand (Red Hat) wrote:
> On 12/3/25 19:01, Frank van der Linden wrote:
> > 
> > The PageHuge() check seems a bit out of place there, if you just
> > removed it altogether you'd get the same results, right? The isolation
> > code will deal with it. But sure, it does potentially avoid doing some
> > unnecessary work.
> 
> commit 4d73ba5fa710fe7d432e0b271e6fecd252aef66e
> Author: Mel Gorman <mgorman@...hsingularity.net>
> Date:   Fri Apr 14 15:14:29 2023 +0100
> 
>     mm: page_alloc: skip regions with hugetlbfs pages when allocating 1G pages
>     A bug was reported by Yuanxi Liu where allocating 1G pages at runtime is
>     taking an excessive amount of time for large amounts of memory.  Further
>     testing allocating huge pages that the cost is linear i.e.  if allocating
>     1G pages in batches of 10 then the time to allocate nr_hugepages from
>     10->20->30->etc increases linearly even though 10 pages are allocated at
>     each step.  Profiles indicated that much of the time is spent checking the
>     validity within already existing huge pages and then attempting a
>     migration that fails after isolating the range, draining pages and a whole
>     lot of other useless work.
>     Commit eb14d4eefdc4 ("mm,page_alloc: drop unnecessary checks from
>     pfn_range_valid_contig") removed two checks, one which ignored huge pages
>     for contiguous allocations as huge pages can sometimes migrate.  While
>     there may be value on migrating a 2M page to satisfy a 1G allocation, it's
>     potentially expensive if the 1G allocation fails and it's pointless to try
>     moving a 1G page for a new 1G allocation or scan the tail pages for valid
>     PFNs.
>     Reintroduce the PageHuge check and assume any contiguous region with
>     hugetlbfs pages is unsuitable for a new 1G allocation.
> 

Worth noting that because this check really only applies to gigantic
page *reservation* (not faulting), the cost isn't necessarily incurred
in a time-critical path.  So, maybe I'm biased here, but the reliability
increase feels like a win even if the operation can take a very long
time under memory pressure (which seems like an outlier anyway).
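
For illustration only, here is a userspace toy model of the trade-off
being discussed.  It is not kernel code: struct toy_page and both
function names are made up for this sketch, and the "allow smaller"
rule is an assumption based on the patch subject, not the actual diff.
The first function mirrors the behaviour described in the quoted commit
(any hugetlb page disqualifies the range); the second rejects only
hugetlb folios at least as large as the request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page descriptor for a toy model; not the kernel's struct page. */
struct toy_page {
	bool reserved;       /* page can never be migrated */
	bool hugetlb;        /* page belongs to a hugetlb folio */
	unsigned int order;  /* order of that folio (0 for a base page) */
};

/* Behaviour described in the quoted commit: any hugetlb page rules the range out. */
bool range_ok_skip_all_huge(const struct toy_page *pages, size_t nr)
{
	assert(pages != NULL);
	for (size_t i = 0; i < nr; i++) {
		if (pages[i].reserved || pages[i].hugetlb)
			return false;
	}
	return true;
}

/*
 * Assumed relaxed rule: a hugetlb folio only rules the range out when it is
 * at least as large as the requested allocation, since moving it could not help.
 */
bool range_ok_allow_smaller(const struct toy_page *pages, size_t nr,
			    unsigned int request_order)
{
	assert(pages != NULL);
	for (size_t i = 0; i < nr; i++) {
		if (pages[i].reserved)
			return false;
		if (pages[i].hugetlb && pages[i].order >= request_order)
			return false;
	}
	return true;
}
```

With 4K base pages, order 9 corresponds to a 2M page and order 18 to a
1G page, so a range holding one 2M hugetlb page fails the first check
but passes the second for a 1G request.  The relaxed rule trades extra
migration work for a better chance of the reservation succeeding.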

~Gregory