Message-ID: <f6b159c1-e07c-489e-ab9b-4d77551877f0@kernel.org>
Date: Wed, 3 Dec 2025 21:14:44 +0100
From: "David Hildenbrand (Red Hat)" <david@...nel.org>
To: Gregory Price <gourry@...rry.net>
Cc: Frank van der Linden <fvdl@...gle.com>,
Johannes Weiner <hannes@...xchg.org>, linux-mm@...ck.org,
kernel-team@...a.com, linux-kernel@...r.kernel.org,
akpm@...ux-foundation.org, vbabka@...e.cz, surenb@...gle.com,
mhocko@...e.com, jackmanb@...gle.com, ziy@...dia.com, kas@...nel.org,
dave.hansen@...ux.intel.com, rick.p.edgecombe@...el.com,
muchun.song@...ux.dev, osalvador@...e.de, x86@...nel.org,
linux-coco@...ts.linux.dev, kvm@...r.kernel.org,
Wei Yang <richard.weiyang@...il.com>, David Rientjes <rientjes@...gle.com>,
Joshua Hahn <joshua.hahnjy@...il.com>
Subject: Re: [PATCH v4] page_alloc: allow migration of smaller hugepages during contig_alloc

On 12/3/25 21:09, Gregory Price wrote:
> On Wed, Dec 03, 2025 at 08:43:29PM +0100, David Hildenbrand (Red Hat) wrote:
>> On 12/3/25 19:01, Frank van der Linden wrote:
>>>
>>> The PageHuge() check seems a bit out of place there, if you just
>>> removed it altogether you'd get the same results, right? The isolation
>>> code will deal with it. But sure, it does potentially avoid doing some
>>> unnecessary work.
>>
>> commit 4d73ba5fa710fe7d432e0b271e6fecd252aef66e
>> Author: Mel Gorman <mgorman@...hsingularity.net>
>> Date: Fri Apr 14 15:14:29 2023 +0100
>>
>> mm: page_alloc: skip regions with hugetlbfs pages when allocating 1G pages
>>
>> A bug was reported by Yuanxi Liu where allocating 1G pages at runtime is
>> taking an excessive amount of time for large amounts of memory. Further
>> testing allocating huge pages that the cost is linear i.e. if allocating
>> 1G pages in batches of 10 then the time to allocate nr_hugepages from
>> 10->20->30->etc increases linearly even though 10 pages are allocated at
>> each step. Profiles indicated that much of the time is spent checking the
>> validity within already existing huge pages and then attempting a
>> migration that fails after isolating the range, draining pages and a whole
>> lot of other useless work.
>>
>> Commit eb14d4eefdc4 ("mm,page_alloc: drop unnecessary checks from
>> pfn_range_valid_contig") removed two checks, one which ignored huge pages
>> for contiguous allocations as huge pages can sometimes migrate. While
>> there may be value on migrating a 2M page to satisfy a 1G allocation, it's
>> potentially expensive if the 1G allocation fails and it's pointless to try
>> moving a 1G page for a new 1G allocation or scan the tail pages for valid
>> PFNs.
>>
>> Reintroduce the PageHuge check and assume any contiguous region with
>> hugetlbfs pages is unsuitable for a new 1G allocation.
>>
>
> Worth noting that because this check really only applies to gigantic
> page *reservation* (not faulting), this isn't necessarily incurred in a
> time-critical path. So, maybe I'm biased here, but the reliability increase
> feels like a win even if the operation can take a very long time under
> memory pressure scenarios (which seem like an outlier anyway).

Not sure I understand correctly. I think the fix from Mel was the right
thing to do.

It does not make sense to try migrating a 1GB page when allocating a 1GB
page. Ever.
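
To make the size comparison concrete, here is a minimal userspace sketch
in C. It is not the kernel's pfn_range_valid_contig() and not the code
from the patch: struct model_page, its fields, and
model_range_valid_contig() are hypothetical names made up for
illustration. It only models the rule discussed here, assuming each page
knows the size of the hugetlb folio it belongs to: a hugetlb folio at
least as large as the request makes the range unsuitable, while a smaller
one is left for migration to deal with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page; this is NOT the kernel's struct page. */
struct model_page {
	bool reserved;                  /* can never back the allocation */
	unsigned long hugetlb_nr_pages; /* size in base pages of the hugetlb
					 * folio this page belongs to, or 0
					 * if it is not part of one */
};

/*
 * Return true if [start, start + nr_pages) looks worth attempting as a
 * contiguous allocation in this simplified model.
 */
bool model_range_valid_contig(const struct model_page *pages, size_t total,
			      size_t start, size_t nr_pages)
{
	assert(nr_pages > 0);

	/* The requested range must lie entirely inside the modelled memory. */
	if (start > total || nr_pages > total - start)
		return false;

	for (size_t i = start; i < start + nr_pages; i++) {
		const struct model_page *p = &pages[i];

		if (p->reserved)
			return false;

		/*
		 * Migrating a hugetlb folio that is as large as (or larger
		 * than) the request cannot help: give up on this range.
		 * Smaller hugetlb folios remain migration candidates.
		 */
		if (p->hugetlb_nr_pages >= nr_pages)
			return false;
	}

	return true;
}
```

With an 8-page model in which pages 2 and 3 form one 2-page hugetlb
folio, an 8-page request over the whole range is still considered, while
a 2-page request starting at page 2 is rejected.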
--
Cheers
David