[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAOUHufY8egkNrxQwd6ms4j6ziyUW5uDjD=yhkxHLqAAOGB4Ccw@mail.gmail.com>
Date: Tue, 20 Jun 2023 02:00:46 -0600
From: Yu Zhao <yuzhao@...gle.com>
To: Nicholas Piggin <npiggin@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Alistair Popple <apopple@...dia.com>,
Anup Patel <anup@...infault.org>,
Ben Gardon <bgardon@...gle.com>,
Borislav Petkov <bp@...en8.de>,
Catalin Marinas <catalin.marinas@....com>,
Chao Peng <chao.p.peng@...ux.intel.com>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Fabiano Rosas <farosas@...ux.ibm.com>,
Gaosheng Cui <cuigaosheng1@...wei.com>,
Gavin Shan <gshan@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
James Morse <james.morse@....com>,
"Jason A. Donenfeld" <Jason@...c4.com>,
Jason Gunthorpe <jgg@...pe.ca>,
Jonathan Corbet <corbet@....net>,
Marc Zyngier <maz@...nel.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Michael Ellerman <mpe@...erman.id.au>,
Michael Larabel <michael@...haellarabel.com>,
Mike Rapoport <rppt@...nel.org>,
Oliver Upton <oliver.upton@...ux.dev>,
Paul Mackerras <paulus@...abs.org>,
Peter Xu <peterx@...hat.com>,
Sean Christopherson <seanjc@...gle.com>,
Steven Rostedt <rostedt@...dmis.org>,
Suzuki K Poulose <suzuki.poulose@....com>,
Thomas Gleixner <tglx@...utronix.de>,
Thomas Huth <thuth@...hat.com>, Will Deacon <will@...nel.org>,
Zenghui Yu <yuzenghui@...wei.com>, kvmarm@...ts.linux.dev,
kvm@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linuxppc-dev@...ts.ozlabs.org,
linux-trace-kernel@...r.kernel.org, x86@...nel.org,
linux-mm@...gle.com
Subject: Re: [PATCH mm-unstable v2 06/10] kvm/powerpc: make radix page tables
RCU safe
On Tue, Jun 20, 2023 at 12:33 AM Nicholas Piggin <npiggin@...il.com> wrote:
>
> On Sat May 27, 2023 at 9:44 AM AEST, Yu Zhao wrote:
> > KVM page tables are currently not RCU safe against remapping, i.e.,
> > kvmppc_unmap_free_pmd_entry_table() et al. The previous
>
> Minor nit but the "page table" is not RCU-safe against something. It
> is RCU-freed, and therefore some algorithm that accesses it can have
> the existence guarantee provided by RCU (usually there still needs
> to be more to it).
>
> > mmu_notifier_ops members rely on kvm->mmu_lock to synchronize with
> > that operation.
> >
> > However, the new mmu_notifier_ops member test_clear_young() provides
> > a fast path that does not take kvm->mmu_lock. To implement
> > kvm_arch_test_clear_young() for that path, orphan page tables need to
> > be freed by RCU.
>
> Short version: clear the referenced bit using RCU instead of MMU lock
> to protect against page table freeing, and there is no problem with
> clearing the bit in a table that has been freed.
>
> Seems reasonable.
Thanks. All above points taken.
> > Unmapping, specifically kvm_unmap_radix(), does not free page tables,
> > hence not a concern.
>
> Not sure if you really need to make the distinction about why the page
> table is freed, we might free them via unmapping. The point is just
> anything that frees them while there can be concurrent access, right?
Correct.
> > Signed-off-by: Yu Zhao <yuzhao@...gle.com>
> > ---
> > arch/powerpc/kvm/book3s_64_mmu_radix.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > index 461307b89c3a..3b65b3b11041 100644
> > --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > @@ -1469,13 +1469,15 @@ int kvmppc_radix_init(void)
> > {
> > unsigned long size = sizeof(void *) << RADIX_PTE_INDEX_SIZE;
> >
> > - kvm_pte_cache = kmem_cache_create("kvm-pte", size, size, 0, pte_ctor);
> > + kvm_pte_cache = kmem_cache_create("kvm-pte", size, size,
> > + SLAB_TYPESAFE_BY_RCU, pte_ctor);
> > if (!kvm_pte_cache)
> > return -ENOMEM;
> >
> > size = sizeof(void *) << RADIX_PMD_INDEX_SIZE;
> >
> > - kvm_pmd_cache = kmem_cache_create("kvm-pmd", size, size, 0, pmd_ctor);
> > + kvm_pmd_cache = kmem_cache_create("kvm-pmd", size, size,
> > + SLAB_TYPESAFE_BY_RCU, pmd_ctor);
> > if (!kvm_pmd_cache) {
> > kmem_cache_destroy(kvm_pte_cache);
> > return -ENOMEM;
>
> KVM PPC HV radix PUD level page tables use the arch/powerpc allocators
> (for some reason), which are not RCU freed. I think you need them too?
We don't. The use of the arch/powerpc allocator for PUD tables seems
appropriate to me because, unlike PMD/PTE tables, we never free PUD
tables during the lifetime of a VM:
* We don't free PUD/PMD/PTE tables when they become empty, i.e., not
mapping any pages but still attached. (We could in theory, as
x86/aarch64 do.)
* We have to free PMD/PTE tables when we replace them with 1GB/2MB
pages. (Otherwise we'd lose track of detached tables.) And we
currently don't support huge pages at P4D level, so we never detach
and free PUD tables.
Powered by blists - more mailing lists