Message-ID: <20240325161919.GD6245@nvidia.com>
Date: Mon, 25 Mar 2024 13:19:19 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Christophe Leroy <christophe.leroy@...roup.eu>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Peter Xu <peterx@...hat.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
linuxppc-dev@...ts.ozlabs.org
Subject: Re: [RFC PATCH 1/8] mm: Provide pagesize to pmd_populate()

On Mon, Mar 25, 2024 at 03:55:54PM +0100, Christophe Leroy wrote:
> Unlike many architectures, powerpc 8xx hardware tablewalk requires
> a two level process for all page sizes, although the second level only
> has one entry when pagesize is 8M.
>
> To fit with Linux page table topology and without requiring special
> page directory layout like hugepd, the page entry will be replicated
> 1024 times in the standard page table. However for large pages it is
> necessary to set bits in the level-1 (PMD) entry. At present, for
> 512k pages the flag is kept in the PTE and inserted in the PMD entry
> at TLB miss exception; that is necessary because we can have pages
> of different sizes in a page table. However the 12 PTE bits are
> fully used and there is no room for an additional bit for page size.
>
> For 8M pages, there will be only one page per PMD entry, it is
> therefore possible to flag the pagesize in the PMD entry, with the
> advantage that the information will already be at the right place for
> the hardware.
>
> To do so, add a new helper called pmd_populate_size() which takes the
> page size as an additional argument, and modify __pte_alloc() to also
> take that argument. pte_alloc() is left unmodified in order to
> reduce churn on callers, and a pte_alloc_size() is added for use by
> pte_alloc_huge().
>
> When an architecture doesn't provide pmd_populate_size(),
> pmd_populate() is used as a fallback.

I think it would be a good idea to document what the semantics of sz
are supposed to be.
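
For illustration only, the fallback described in the quoted text could
be sketched as below, with a comment stating one possible meaning of
sz. This is a userspace mock, not the code from the patch: the struct
and typedef definitions are stand-ins, and the documented meaning of sz
is an assumption, not something the series has settled.

```c
#include <stddef.h>

/* Userspace stand-ins for kernel types; not the real definitions. */
struct mm_struct { int unused; };
typedef struct { unsigned long pmd; } pmd_t;
typedef unsigned long *pgtable_t;

/* Generic populate: store the page table address in the PMD entry. */
static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmdp,
				pgtable_t pte)
{
	(void)mm;
	pmdp->pmd = (unsigned long)pte;
}

/*
 * An architecture that cares about the page size would provide its own
 * pmd_populate_size() and define a macro of the same name. Otherwise
 * this generic version is used.
 *
 * @sz: size of the pages the new page table is going to map
 *      (assumed meaning, for illustration only).
 */
#ifndef pmd_populate_size
static inline void pmd_populate_size(struct mm_struct *mm, pmd_t *pmdp,
				     pgtable_t pte, size_t sz)
{
	(void)sz;	/* the generic fallback ignores the size */
	pmd_populate(mm, pmdp, pte);
}
#define pmd_populate_size pmd_populate_size
#endif
```

With this shape, callers can always pass a size, and architectures that
do not need it pay nothing for the extra argument.
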

Just a general remark, probably nothing for this, but with these new
arguments the historical naming seems pretty tortured for
pte_alloc_size(). Something like pmd_populate_leaf(size) as a naming
scheme would make this more intuitive. I.e. pmd_populate_leaf() gives
you a PMD entry where the entry points to a leaf page table able to
store folios of at least size.
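
As a rough userspace sketch of that naming idea (not kernel code), an
architecture with a per-PMD size flag might look something like this.
The types are stand-ins, and SKETCH_SZ_8M, SKETCH_PMD_PAGE_8M and the
flag value are made up for the example; the real 8xx bit layout is not
reproduced here.

```c
#include <stddef.h>

/* Userspace stand-ins for kernel types; not the real definitions. */
struct mm_struct { int unused; };
typedef struct { unsigned long pmd; } pmd_t;
typedef unsigned long *pgtable_t;

#define SKETCH_SZ_8M		((size_t)8 << 20)
/* Made-up flag living in the low (alignment) bits of the entry. */
#define SKETCH_PMD_PAGE_8M	0x1UL

/*
 * pmd_populate_leaf - point a PMD entry at a leaf page table
 * @sz: the leaf table must be able to store folios of at least this size
 *
 * Hypothetical helper: when sz reaches 8M, the page size is flagged in
 * the PMD entry itself so it is already where the hardware looks.
 */
static inline void pmd_populate_leaf(struct mm_struct *mm, pmd_t *pmdp,
				     pgtable_t pte, size_t sz)
{
	unsigned long val = (unsigned long)pte;

	(void)mm;
	if (sz >= SKETCH_SZ_8M)
		val |= SKETCH_PMD_PAGE_8M;
	pmdp->pmd = val;
}
```

The point is only that the size argument has an obvious meaning at the
call site; whether the flag lives in the PMD or elsewhere stays an
architecture detail.
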

Anyhow, I thought the edits to the mm helpers were fine, certainly
much nicer than hugepd. Do you see a path to remove hugepd entirely
from here?

Thanks,
Jason