Message-ID: <01A47A14-3B5F-408B-AC37-64E36AFCF14C@nvidia.com>
Date: Thu, 18 Dec 2025 16:17:14 -0500
From: Zi Yan <ziy@...dia.com>
To: Gregory Price <gourry@...rry.net>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org, kernel-team@...a.com,
akpm@...ux-foundation.org, vbabka@...e.cz, surenb@...gle.com,
mhocko@...e.com, jackmanb@...gle.com, hannes@...xchg.org,
richard.weiyang@...il.com, osalvador@...e.de, rientjes@...gle.com,
david@...hat.com, joshua.hahnjy@...il.com, fvdl@...gle.com
Subject: Re: [PATCH v5] page_alloc: allow migration of smaller hugepages during contig_alloc
On 18 Dec 2025, at 15:42, Gregory Price wrote:
> On Thu, Dec 18, 2025 at 02:45:37PM -0500, Zi Yan wrote:
>>
>> That can save another scan? And caller can pass hugetlb_search_result if
>> they care and check its value if pfn_range_valid_contig() returns false.
>>
>
> Well, first, I've generally seen output parameters like this discouraged
> for such trivial things. But that aside...
>
> We have to scan again either way if we want to prefer allocating
> non-hugetlb regions in different memory blocks first. This is what Mel
> was pointing out (we should touch every OTHER block before we attempt
> HugeTLB migrations).
OK, so you assume hugetlb pages are harder to migrate than other movable
pages. Given the limited number of hugetlb pages, that is quite possible.
Anyway, I will wait for your v6. Thank you for the explanation and the
prototype below.
>
> The best optimization you could hope for is something like the following
> - but honestly, this is ugly and racy (zone contents may have changed
> between scans), and if you're already in the slow, reliable path, then we
> should just be slow and re-scan the non-hugetlb sections as well.
>
> Other than this being ugly, I don't have strong feelings. If people
> would prefer the second pass to ONLY touch hugetlb sections, I'll ship
> this.
>
> static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
> 				   unsigned long nr_pages, bool search_hugetlb,
> 				   bool *hugetlb_found)
> {
> 	unsigned long i, end_pfn = start_pfn + nr_pages;
> 	bool hugetlb = false;
>
> 	for (i = start_pfn; i < end_pfn; i++) {
> 		struct page *page = pfn_to_page(i);
>
> 		...
> 		if (PageHuge(page)) {
> 			if (hugetlb_found)
> 				*hugetlb_found = true;
>
> 			if (!search_hugetlb)
> 				return false;
>
> 			...
> 			hugetlb = true;
> 		}
> 	}
> 	/*
> 	 * If we're searching for hugetlb regions, only return those.
> 	 * Otherwise only return regions without hugetlb reservations.
> 	 */
> 	return !search_hugetlb || hugetlb;
> }
>
>
> struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
> 				       int nid, nodemask_t *nodemask)
> {
> 	bool search_hugetlb = false;
> 	bool hugetlb_found = false;
> 	struct zonelist *zonelist;
> 	unsigned long flags, pfn;
> 	struct zoneref *z;
> 	struct zone *zone;
>
> retry:
> 	zonelist = node_zonelist(nid, gfp_mask);
> 	for_each_zone_zonelist_nodemask(zone, z, zonelist,
> 					gfp_zone(gfp_mask), nodemask) {
> 		spin_lock_irqsave(&zone->lock, flags);
>
> 		pfn = ALIGN(zone->zone_start_pfn, nr_pages);
> 		while (zone_spans_last_pfn(zone, pfn, nr_pages)) {
> 			if (pfn_range_valid_contig(zone, pfn, nr_pages,
> 						   search_hugetlb,
> 						   &hugetlb_found)) {
> 				...
> 			}
> 			pfn += nr_pages;
> 		}
> 		spin_unlock_irqrestore(&zone->lock, flags);
> 	}
> 	if (IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION) &&
> 	    !search_hugetlb && hugetlb_found) {
> 		search_hugetlb = true;
> 		goto retry;
> 	}
> 	return NULL;
> }
>
> ~Gregory
Best Regards,
Yan, Zi
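
[Editor's note: the two-pass search discussed in the thread (first prefer
ranges with no hugetlb pages, and only fall back to ranges containing
hugetlb if the first pass found some) can be sketched as a standalone toy
program over an array of page types. This is a hypothetical userspace
model, not kernel code; the `enum ptype` values, `range_valid()`, and
`find_contig()` names are invented for illustration.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page states standing in for real page flags. */
enum ptype { P_FREE, P_MOVABLE, P_HUGETLB, P_PINNED };

/*
 * Mirrors the logic of pfn_range_valid_contig() above: in pass 1
 * (search_hugetlb == false) reject any range containing hugetlb but
 * remember that one was seen; in pass 2 accept only hugetlb ranges.
 */
static bool range_valid(const enum ptype *pages, size_t start, size_t n,
			bool search_hugetlb, bool *hugetlb_found)
{
	bool hugetlb = false;

	for (size_t i = start; i < start + n; i++) {
		if (pages[i] == P_PINNED)	/* unmovable: never usable */
			return false;
		if (pages[i] == P_HUGETLB) {
			if (hugetlb_found)
				*hugetlb_found = true;
			if (!search_hugetlb)
				return false;
			hugetlb = true;
		}
	}
	/* Pass 1 wants hugetlb-free ranges; pass 2 wants hugetlb ranges. */
	return !search_hugetlb || hugetlb;
}

/* Returns the start index of a valid n-page range, or -1 if none exists. */
static long find_contig(const enum ptype *pages, size_t total, size_t n)
{
	bool search_hugetlb = false, hugetlb_found = false;

retry:
	for (size_t start = 0; start + n <= total; start += n)
		if (range_valid(pages, start, n, search_hugetlb,
				&hugetlb_found))
			return (long)start;
	if (!search_hugetlb && hugetlb_found) {
		search_hugetlb = true;	/* second pass: hugetlb ranges only */
		goto retry;
	}
	return -1;
}
```

The point of the structure is the preference order: a hugetlb-free range
anywhere wins over a hugetlb range, even one that appears earlier in the
scan, because the second pass only starts after the first has exhausted
every zone.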