Message-Id: <CTHF42QOSYR0.1Y16SFUF11F3X@wheely>
Date: Tue, 20 Jun 2023 20:49:56 +1000
From: "Nicholas Piggin" <npiggin@...il.com>
To: "Yu Zhao" <yuzhao@...gle.com>
Cc: "Andrew Morton" <akpm@...ux-foundation.org>,
"Paolo Bonzini" <pbonzini@...hat.com>,
"Alistair Popple" <apopple@...dia.com>,
"Anup Patel" <anup@...infault.org>,
"Ben Gardon" <bgardon@...gle.com>,
"Borislav Petkov" <bp@...en8.de>,
"Catalin Marinas" <catalin.marinas@....com>,
"Chao Peng" <chao.p.peng@...ux.intel.com>,
"Christophe Leroy" <christophe.leroy@...roup.eu>,
"Dave Hansen" <dave.hansen@...ux.intel.com>,
"Fabiano Rosas" <farosas@...ux.ibm.com>,
"Gaosheng Cui" <cuigaosheng1@...wei.com>,
"Gavin Shan" <gshan@...hat.com>, "H. Peter Anvin" <hpa@...or.com>,
"Ingo Molnar" <mingo@...hat.com>,
"James Morse" <james.morse@....com>,
"Jason A. Donenfeld" <Jason@...c4.com>,
"Jason Gunthorpe" <jgg@...pe.ca>,
"Jonathan Corbet" <corbet@....net>,
"Marc Zyngier" <maz@...nel.org>,
"Masami Hiramatsu" <mhiramat@...nel.org>,
"Michael Ellerman" <mpe@...erman.id.au>,
"Michael Larabel" <michael@...haellarabel.com>,
"Mike Rapoport" <rppt@...nel.org>,
"Oliver Upton" <oliver.upton@...ux.dev>,
"Paul Mackerras" <paulus@...abs.org>,
"Peter Xu" <peterx@...hat.com>,
"Sean Christopherson" <seanjc@...gle.com>,
"Steven Rostedt" <rostedt@...dmis.org>,
"Suzuki K Poulose" <suzuki.poulose@....com>,
"Thomas Gleixner" <tglx@...utronix.de>,
"Thomas Huth" <thuth@...hat.com>, "Will Deacon" <will@...nel.org>,
"Zenghui Yu" <yuzenghui@...wei.com>, <kvmarm@...ts.linux.dev>,
<kvm@...r.kernel.org>, <linux-arm-kernel@...ts.infradead.org>,
<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<linux-mm@...ck.org>, <linuxppc-dev@...ts.ozlabs.org>,
<linux-trace-kernel@...r.kernel.org>, <x86@...nel.org>,
<linux-mm@...gle.com>
Subject: Re: [PATCH mm-unstable v2 06/10] kvm/powerpc: make radix page tables RCU safe

On Tue Jun 20, 2023 at 6:00 PM AEST, Yu Zhao wrote:
> On Tue, Jun 20, 2023 at 12:33 AM Nicholas Piggin <npiggin@...il.com> wrote:
> >
> > On Sat May 27, 2023 at 9:44 AM AEST, Yu Zhao wrote:
> > > KVM page tables are currently not RCU safe against remapping, i.e.,
> > > kvmppc_unmap_free_pmd_entry_table() et al. The previous
> >
> > Minor nit but the "page table" is not RCU-safe against something. It
> > is RCU-freed, and therefore some algorithm that accesses it can have
> > the existence guarantee provided by RCU (usually there still needs
> > to be more to it).
> >
> > > mmu_notifier_ops members rely on kvm->mmu_lock to synchronize with
> > > that operation.
> > >
> > > However, the new mmu_notifier_ops member test_clear_young() provides
> > > a fast path that does not take kvm->mmu_lock. To implement
> > > kvm_arch_test_clear_young() for that path, orphan page tables need to
> > > be freed by RCU.
> >
> > Short version: clear the referenced bit using RCU instead of MMU lock
> > to protect against page table freeing, and there is no problem with
> > clearing the bit in a table that has been freed.
> >
> > Seems reasonable.
>
> Thanks. All above points taken.
>
> > > Unmapping, specifically kvm_unmap_radix(), does not free page tables,
> > > hence not a concern.
> >
> > Not sure if you really need to make the distinction about why the page
> > table is freed, we might free them via unmapping. The point is just
> > anything that frees them while there can be concurrent access, right?
>
> Correct.
>
> > > Signed-off-by: Yu Zhao <yuzhao@...gle.com>
> > > ---
> > > arch/powerpc/kvm/book3s_64_mmu_radix.c | 6 ++++--
> > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > > index 461307b89c3a..3b65b3b11041 100644
> > > --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > > +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > > @@ -1469,13 +1469,15 @@ int kvmppc_radix_init(void)
> > > {
> > > unsigned long size = sizeof(void *) << RADIX_PTE_INDEX_SIZE;
> > >
> > > - kvm_pte_cache = kmem_cache_create("kvm-pte", size, size, 0, pte_ctor);
> > > + kvm_pte_cache = kmem_cache_create("kvm-pte", size, size,
> > > + SLAB_TYPESAFE_BY_RCU, pte_ctor);
> > > if (!kvm_pte_cache)
> > > return -ENOMEM;
> > >
> > > size = sizeof(void *) << RADIX_PMD_INDEX_SIZE;
> > >
> > > - kvm_pmd_cache = kmem_cache_create("kvm-pmd", size, size, 0, pmd_ctor);
> > > + kvm_pmd_cache = kmem_cache_create("kvm-pmd", size, size,
> > > + SLAB_TYPESAFE_BY_RCU, pmd_ctor);
> > > if (!kvm_pmd_cache) {
> > > kmem_cache_destroy(kvm_pte_cache);
> > > return -ENOMEM;
> >
> > KVM PPC HV radix PUD level page tables use the arch/powerpc allocators
> > (for some reason), which are not RCU freed. I think you need them too?
>
> We don't. The use of the arch/powerpc allocator for PUD tables seems
> appropriate to me because, unlike PMD/PTE tables, we never free PUD
> tables during the lifetime of a VM:

Ah, you're right: pud_free() is only reached in the double-alloc case,
so the freed table is never visible to concurrent threads.

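
A minimal userspace sketch of that double-alloc pattern, assuming it means
"allocate outside the lock, re-check under the lock, free the loser". This is
not the kernel code: struct slot, install_table() and the pthread mutex are
hypothetical stand-ins for the upper-level entry, the table allocation path
and kvm->mmu_lock.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for an upper-level entry that points at a table. */
struct slot {
	pthread_mutex_t lock;	/* plays the role of kvm->mmu_lock */
	void *table;		/* NULL until a table has been installed */
};

/*
 * Allocate outside the lock, publish under it.
 * Returns 1 if our table was installed, 0 if another caller had already
 * installed one (ours is freed), and -1 if the allocation failed.
 */
static int install_table(struct slot *s, size_t size)
{
	void *new_table = calloc(1, size);
	int installed = 0;

	if (!new_table)
		return -1;

	pthread_mutex_lock(&s->lock);
	if (!s->table) {
		s->table = new_table;	/* published: visible to others now */
		new_table = NULL;
		installed = 1;
	}
	pthread_mutex_unlock(&s->lock);

	/*
	 * If we lost the race, new_table was never reachable from the slot,
	 * so no concurrent walker can hold a pointer to it and an immediate
	 * free is safe. free(NULL) is a no-op for the winner.
	 */
	free(new_table);
	return installed;
}
```

The only table freed here is one that was never linked in, which is why this
path would not need an RCU grace period in the sketch.
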
> * We don't free PUD/PMD/PTE tables when they become empty, i.e., not
> mapping any pages but still attached. (We could in theory, as
> x86/aarch64 do.)

We may try to do that at some point, but that's not related to your
patch for now so no worries.

Thanks,
Nick