[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Z5FgSn0y/c9pItED@li-008a6a4c-3549-11b2-a85c-c5cc2836eea2.ibm.com>
Date: Wed, 22 Jan 2025 22:16:58 +0100
From: Alexander Gordeev <agordeev@...ux.ibm.com>
To: Heiko Carstens <hca@...ux.ibm.com>, Kevin Brodsky <kevin.brodsky@....com>
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
Catalin Marinas <catalin.marinas@....com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Linus Walleij <linus.walleij@...aro.org>,
Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
"Mike Rapoport (IBM)" <rppt@...nel.org>,
Ryan Roberts <ryan.roberts@....com>,
Thomas Gleixner <tglx@...utronix.de>, Will Deacon <will@...nel.org>,
Matthew Wilcox <willy@...radead.org>,
Qi Zheng <zhengqi.arch@...edance.com>, linux-alpha@...r.kernel.org,
linux-arch@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-csky@...r.kernel.org, linux-hexagon@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-m68k@...ts.linux-m68k.org,
linux-mips@...r.kernel.org, linux-openrisc@...r.kernel.org,
linux-parisc@...r.kernel.org, linux-riscv@...ts.infradead.org,
linux-s390@...r.kernel.org, linux-snps-arc@...ts.infradead.org,
linux-um@...ts.infradead.org, loongarch@...ts.linux.dev,
x86@...nel.org
Subject: Re: [PATCH v2 6/6] mm: Introduce ctor/dtor at PGD level
On Wed, Jan 22, 2025 at 03:06:05PM +0100, Alexander Gordeev wrote:
Hi Kevin,
> On Wed, Jan 22, 2025 at 08:49:54AM +0100, Heiko Carstens wrote:
> > > > static inline pgd_t *pgd_alloc(struct mm_struct *mm)
> > > > {
> > > > - return (pgd_t *) crst_table_alloc(mm);
> > > > + unsigned long *table = crst_table_alloc(mm);
> > > > +
> > > > + if (!table)
> > > > + return NULL;
> > >
> > > I do not know status of this series, but FWIW, this call is missed:
> > >
> > > crst_table_init(table, _REGION1_ENTRY_EMPTY);
> >
> > Why is that missing?
>
> Because the follow-up pagetable_pgd_ctor() is called against uninitialized
> page table, while other pagetable_pXd_ctor() variants are called against
> initialized one. I could imagine complications as result of that.
>
> Whether Region1 table is the right choice is a big question though, as you
> noticed below.
As discussed with Heiko, we do not want to add the extra crst_table_init() call
at least due to performance impact. So please ignore my comment.
> > A pgd table can be a Region1, Region2, or Region3 table. The only caller of
> > this function is mm_init() via mm_alloc_pgd(); and right after mm_alloc_pgd()
> > there is a call to init_new_context() which will initialize the pgd correctly.
>
> init_new_context() is in a way a constructor as well, so whole thing looks odd
> to me. But I do not immedeately see a better way :(
>
> > I guess what really gets odd, and might be broken (haven't checked yet) is
> > what happens on dynamic upgrade of page table levels (->crst_table_upgrade()).
>
> Hmm, that is a good point.
>
> > With that a pgd may become a pud, and with that we get an imbalance with
> > the ctor/dtor calls for the various page table levels when they get freed
>
> The ctor/dtor mismatch should not be a problem, as pagetable_pgd|p4d|pud_ctor()
> are the same and there is one pagetable_dtor() for all top levels as of now.
> But if it ever comes to separate implementations, then we are in the world
> of pain.
>
> > again. Plus, at first glance, it looks also broken that we have open-coded
> > crst_alloc() calls instead of using the "proper" page table allocation API
> > within crst_table_upgrade(), which again would cause an imbalance.
>
> This is a good point too.
The below bits are seems to be missed. We will test it and send a patch.
diff --git a/arch/s390/mm/pgalloc.c b/arch/s390/mm/pgalloc.c
index a4e761902093..d33f55b7ee98 100644
--- a/arch/s390/mm/pgalloc.c
+++ b/arch/s390/mm/pgalloc.c
@@ -88,12 +88,14 @@ int crst_table_upgrade(struct mm_struct *mm, unsigned long end)
if (unlikely(!p4d))
goto err_p4d;
crst_table_init(p4d, _REGION2_ENTRY_EMPTY);
+ pagetable_p4d_ctor(virt_to_ptdesc(p4d));
}
if (end > _REGION1_SIZE) {
pgd = crst_table_alloc(mm);
if (unlikely(!pgd))
goto err_pgd;
crst_table_init(pgd, _REGION1_ENTRY_EMPTY);
+ pagetable_pgd_ctor(virt_to_ptdesc(pgd));
}
spin_lock_bh(&mm->page_table_lock);
@@ -130,6 +132,7 @@ int crst_table_upgrade(struct mm_struct *mm, unsigned long end)
return 0;
err_pgd:
+ pagetable_dtor(virt_to_ptdesc(p4d));
crst_table_free(mm, p4d);
err_p4d:
return -ENOMEM;
Thanks!
Powered by blists - more mailing lists