linux-kernel - Re: [PATCH v3] page_alloc: allow migration of smaller hugepages during contig

Message-ID: <aSDAq3Wj9XN2D9ER@gourry-fedora-PF4VCD3F>
Date: Fri, 21 Nov 2025 14:42:35 -0500
From: Gregory Price <gourry@...rry.net>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-mm@...ck.org, kernel-team@...a.com, vbabka@...e.cz,
	surenb@...gle.com, mhocko@...e.com, jackmanb@...gle.com,
	hannes@...xchg.org, ziy@...dia.com, linux-kernel@...r.kernel.org,
	David Hildenbrand <david@...hat.com>,
	Wei Yang <richard.weiyang@...il.com>,
	Oscar Salvador <osalvador@...e.de>,
	David Rientjes <rientjes@...gle.com>
Subject: Re: [PATCH v3] page_alloc: allow migration of smaller hugepages
 during contig_alloc

On Fri, Nov 21, 2025 at 11:31:38AM -0800, Andrew Morton wrote:
> On Fri, 21 Nov 2025 14:15:40 -0500 Gregory Price <gourry@...rry.net> wrote:
> 
> > We presently skip regions with hugepages entirely when trying to do
> > contiguous page allocation.  Instead, if hugepage migration is enabled,
> > consider regions with hugepages smaller than the target contiguous
> > allocation request as valid targets for allocation.
> 
> Why?  What benefit does this have to our users?
> 
> Some runtime testing results might be helpful?

When hugepages of multiple sizes are in use, alloc_contig is less
reliable, in particular when 2MB and 1GB HugeTLB pages are present on
the same system.

As David pointed out, the same logic is already present in
isolate_migrate_pages_block(), which is called further down the stack
from alloc_contig, but it is unreachable because
pfn_range_valid_contig() filters those regions out first.

I allude to this in the second paragraph, but it is worth spelling out
explicitly.  Will update.

> 
> > isolate_migrate_pages_block() already expects requests with hugepages
> > to originate from alloc_contig, and hugetlb code also does a migratable
> > check when isolating in folio_isolate_hugetlb().
> > 
> > Suggested-by: David Hildenbrand <david@...hat.com>
> 
> A Link: here might be illuminating.

Ah, fair point.

Link: https://lore.kernel.org/linux-mm/6fe3562d-49b2-4975-aa86-e139c535ad00@redhat.com/

"""
However, it also means that we won't try moving 2MB folios to free up a
1GB folio.

That could be supported by allowing for moving hugetlb folios when their
size is small enough to be served by the buddy, and the size we are
allocating is larger than the one of these folios.
"""
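
The size rule David describes can be sketched in userspace C as below.
This is only an illustration of the comparison, not the kernel code:
hugepage_is_migration_candidate() and SKETCH_MAX_FOLIO_ORDER are
hypothetical names, and the sketch assumes 4KB base pages, so that a 2MB
page is order 9 and a 1GB page is order 18. The real MAX_FOLIO_ORDER
depends on architecture and config.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel's MAX_FOLIO_ORDER. The value 18
 * is an assumption for this sketch (1GB with 4KB base pages).
 */
#define SKETCH_MAX_FOLIO_ORDER 18u

/*
 * Return true if a hugetlb page of the given order, found inside the
 * candidate range, should not disqualify a contiguous request of
 * nr_pages base pages. Mirrors the comparison in the patch hunk:
 * reject pages that are too large for the buddy allocator to serve,
 * and reject pages at least as large as the request itself.
 */
static bool hugepage_is_migration_candidate(unsigned int order,
					    unsigned long nr_pages)
{
	/* Guard the shift below against an out-of-range order. */
	assert(order < sizeof(unsigned long) * CHAR_BIT);

	if (order >= SKETCH_MAX_FOLIO_ORDER)
		return false;

	/* Moving a same-size or larger page cannot help this request. */
	if (nr_pages <= (1UL << order))
		return false;

	return true;
}
```

Under these assumptions, a 2MB page (order 9) inside a 1GB request
(262144 base pages) is a candidate, while the same page inside a 2MB
request (512 base pages) is not, and a 1GB page is never a candidate.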

> 
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -6849,8 +6849,19 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
> >  		if (PageReserved(page))
> >  			return false;
> >  
> > -		if (PageHuge(page))
> > -			return false;
> > +		if (PageHuge(page)) {
> > +			unsigned int order;
> > +
> > +			if (!IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION))
> > +				return false;
> > +
> > +			/* Don't consider moving same size/larger pages */
> 
> Comment says "what" (which was fairly obvious).  Please reveal "why".
> 

ack.

> > +			page = compound_head(page);
> > +			order = compound_order(page);
> > +			if ((order >= MAX_FOLIO_ORDER) ||
> > +			    (nr_pages <= (1 << order)))
> > +				return false;
> > +		}
> >  	}
> >  	return true;
> >  }
>