Message-ID: <027b6ac9-836d-4f89-a819-e24d487f9c8e@kernel.org>
Date: Thu, 13 Nov 2025 19:44:31 +0100
From: Christophe Leroy <chleroy@...nel.org>
To: "David Hildenbrand (Red Hat)" <david@...nel.org>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
Sourabh Jain <sourabhjain@...ux.ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>,
"Ritesh Harjani (IBM)" <ritesh.list@...il.com>,
Madhavan Srinivasan <maddy@...ux.ibm.com>, Donet Tom
<donettom@...ux.ibm.com>, Michael Ellerman <mpe@...erman.id.au>,
Nicholas Piggin <npiggin@...il.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>, Vlastimil Babka
<vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>
Subject: Re: [PATCH v1] mm: fix MAX_FOLIO_ORDER on powerpc configs with
hugetlb
On 13/11/2025 at 16:21, David Hildenbrand (Red Hat) wrote:
> On 13.11.25 14:01, Lorenzo Stoakes wrote:
>
> [...]
>
>>> @@ -137,6 +137,7 @@ config PPC
>>> select ARCH_HAS_DMA_OPS if PPC64
>>> select ARCH_HAS_FORTIFY_SOURCE
>>> select ARCH_HAS_GCOV_PROFILE_ALL
>>> + select ARCH_HAS_GIGANTIC_PAGE if ARCH_SUPPORTS_HUGETLBFS
>>
>> Given we know the architecture can support it (presumably all powerpc
>> arches or all that can support hugetlbfs anyway?), this seems reasonable.
>
> powerpc allows for quite some different configs, so I assume there are
> some configs that don't allow ARCH_SUPPORTS_HUGETLBFS.
Yes indeed. For instance the powerpc 603 and 604 have no huge pages.
>
> [...]
>
>>> /*
>>> * There is no real limit on the folio size. We limit them to the maximum we
>>> - * currently expect (e.g., hugetlb, dax).
>>> + * currently expect: with hugetlb, we expect no folios larger than 16 GiB.
>>
>> Maybe worth saying 'see CONFIG_HAVE_GIGANTIC_FOLIOS definition' or
>> something?
>
> To me that's implied from the initial ifdef. But not strong opinion
> about spelling that out.
>
>>
>>> + */
>>> +#define MAX_FOLIO_ORDER get_order(SZ_16G)
>>
>> Hmm, is the base page size somehow runtime adjustable on powerpc? Why isn't
>> PUD_ORDER good enough here?
>
> We tried P4D_ORDER but even that doesn't work. I think we effectively
> end up with cont-pmd/cont-PUD mappings (or even cont-p4d, I am not 100%
> sure because the folding code complicates that).
>
> See powerpc's variant of huge_pte_alloc() where we have stuff like
>
>     p4d = p4d_offset(pgd_offset(mm, addr), addr);
>     if (!mm_pud_folded(mm) && sz >= P4D_SIZE)
>         return (pte_t *)p4d;
>
> As soon as we go to things like P4D_ORDER we're suddenly in the range of
> 512 GiB on x86 etc, so that's also not what we want as an easy fix. (and
> it didn't work)
>
On 32 bits there are only the PGDIR and the page table, so:
PGDIR_SHIFT = P4D_SHIFT = PUD_SHIFT = PMD_SHIFT
For instance on powerpc 8xx,
PGDIR_SIZE is 4M
Largest hugepage is 8M.
So even PGDIR_ORDER isn't enough.
Christophe