Message-ID: <ZmhofWIiMC3I0aMF@localhost.localdomain>
Date: Tue, 11 Jun 2024 17:08:45 +0200
From: Oscar Salvador <osalvador@...e.de>
To: Peter Xu <peterx@...hat.com>
Cc: Christophe Leroy <christophe.leroy@...roup.eu>,
Andrew Morton <akpm@...ux-foundation.org>,
Jason Gunthorpe <jgg@...dia.com>,
Michael Ellerman <mpe@...erman.id.au>,
Nicholas Piggin <npiggin@...il.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH v5 02/18] mm: Define __pte_leaf_size() to also take a PMD entry

On Tue, Jun 11, 2024 at 10:17:30AM -0400, Peter Xu wrote:
> Oscar,
>
> On Tue, Jun 11, 2024 at 11:34:23AM +0200, Oscar Salvador wrote:
> > Which means that they would be caught in the following code:
> >
> >         ptl = pmd_huge_lock(pmd, vma);
> >         if (ptl) {
> >                 - 8MB hugepages will be handled here
> >                 smaps_pmd_entry(pmd, addr, walk);
> >                 spin_unlock(ptl);
> >         }
> >         /* pte stuff */
> >         ...
>
> Just one quick comment: I think there's one challenge though as this is
> also not a generic "pmd leaf", but a pgtable page underneath. I think it
> means smaps_pmd_entry() won't trivially work here, e.g., it will start to
> do this:
>
> 	if (pmd_present(*pmd)) {
> 		page = vm_normal_page_pmd(vma, addr, *pmd);
>
> Here vm_normal_page_pmd() will only work if pmd_leaf() satisfies its
> definition as:
>
> * - It should contain a huge PFN, which points to a huge page larger than
> * PAGE_SIZE of the platform. The PFN format isn't important here.
>
> But now it's a pgtable page, containing cont-ptes. Similarly, I think most
> pmd_*() helpers will stop working there if we report it as a leaf.

Heh, I think I managed to confuse myself.
I do not know why, but I thought that

 static inline pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
 	if (ptep_is_8m_pmdp(mm, addr, ptep))
 		ptep = pte_offset_kernel((pmd_t *)ptep, 0);
 	return ptep_get(ptep);
 }

would return the address of the pmd for 8MB hugepages, but it will
return the address of the first pte?
Then yeah, this will not work as I thought.
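
To make the behaviour concrete, here is a toy user-space model of what that
helper does. This is only a sketch, not kernel code: every toy_* name and
TOY_PTRS_PER_PTE are made up for illustration, and the out-of-band is_8m flag
merely stands in for whatever ptep_is_8m_pmdp() checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, tiny page table size; the real value is arch-specific. */
#define TOY_PTRS_PER_PTE 4

/* Toy PTE: just a raw value. */
typedef struct { uint32_t val; } toy_pte_t;

/*
 * Toy PMD entry: it points to a page table of PTEs. The is_8m flag is an
 * assumption of this model, standing in for ptep_is_8m_pmdp().
 */
typedef struct {
	toy_pte_t *table;
	bool is_8m;
} toy_pmd_t;

/* Stand-in for pte_offset_kernel(): pointer to the idx-th PTE in the table. */
static inline toy_pte_t *toy_pte_offset(const toy_pmd_t *pmdp, unsigned int idx)
{
	assert(idx < TOY_PTRS_PER_PTE);
	return &pmdp->table[idx];
}

/*
 * Same shape as the 8M branch of huge_ptep_get(): step from the PMD entry
 * down into the page table and read the first PTE. What comes back is a
 * PTE, not the PMD entry.
 */
static inline toy_pte_t toy_huge_ptep_get(const toy_pmd_t *pmdp)
{
	assert(pmdp->is_8m);
	return *toy_pte_offset(pmdp, 0);
}
```

In this model, a caller that hands in the PMD entry gets back the contents of
the first PTE slot, which is why the result cannot be treated as a PMD leaf.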

The problem is that we do not have spare bits on 8xx to mark these ptes
as cont-ptes or to mark them as 8MB, so I do not see a clear path on how
we could remove huge_ptep_get for 8xx.

I am really curious, though, how we handle that for THP. Or does THP on
8xx not support that size?

--
Oscar Salvador
SUSE Labs