Message-ID: <CAGtprH8hk0akN1pbxO95O_AyHf-BDN5tOH-Lbxg-qcTMg-27UQ@mail.gmail.com>
Date: Wed, 10 Dec 2025 17:42:44 -0800
From: Vishal Annapurve <vannapurve@...gle.com>
To: Yan Zhao <yan.y.zhao@...el.com>
Cc: Sagi Shahar <sagis@...gle.com>, pbonzini@...hat.com, seanjc@...gle.com,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org, x86@...nel.org,
rick.p.edgecombe@...el.com, dave.hansen@...el.com, kas@...nel.org,
tabba@...gle.com, ackerleytng@...gle.com, quic_eberman@...cinc.com,
michael.roth@....com, david@...hat.com, vbabka@...e.cz,
thomas.lendacky@....com, pgonda@...gle.com, zhiquan1.li@...el.com,
fan.du@...el.com, jun.miao@...el.com, ira.weiny@...el.com,
isaku.yamahata@...el.com, xiaoyao.li@...el.com, binbin.wu@...ux.intel.com,
chao.p.peng@...el.com
Subject: Re: [RFC PATCH v2 21/23] KVM: TDX: Preallocate PAMT pages to be used
in split path
On Sun, Dec 7, 2025 at 9:51 PM Yan Zhao <yan.y.zhao@...el.com> wrote:
>
> > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > index 6b6c46c27390..508b133df903 100644
> > > --- a/arch/x86/include/asm/kvm_host.h
> > > +++ b/arch/x86/include/asm/kvm_host.h
> > > @@ -1591,6 +1591,8 @@ struct kvm_arch {
> > > #define SPLIT_DESC_CACHE_MIN_NR_OBJECTS (SPTE_ENT_PER_PAGE + 1)
> > > struct kvm_mmu_memory_cache split_desc_cache;
> > >
> > > + struct kvm_mmu_memory_cache pamt_page_cache;
> > > +
> >
> > The latest DPAMT patches use a per-vcpu tdx_prealloc struct to handle
> > preallocating pages for pamt. I'm wondering if you've considered how
> > this would work here since some of the calls requiring pamt originate
> > from user space ioctls and therefore are not associated with a vcpu.
> I'll use a per-VM tdx_prealloc struct for splitting here, similar to the
> per-VM pamt_page_cache.
>
> diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
> index 43dd295b7fd6..91bea25da528 100644
> --- a/arch/x86/kvm/vmx/tdx.h
> +++ b/arch/x86/kvm/vmx/tdx.h
> @@ -48,6 +48,9 @@ struct kvm_tdx {
> * Set/unset is protected with kvm->mmu_lock.
> */
> bool wait_for_sept_zap;
> +
> + spinlock_t external_kvm_split_cache_lock;
> + struct tdx_prealloc prealloc_split_cache;
> };
>
> > Since tdx_prealloc is a per-vCPU struct, there are no race issues
> > when multiple vCPUs need to add PAMT pages, but here it would be
> > trickier because, theoretically, multiple threads could split
> > different pages simultaneously.
> A spin lock external_kvm_split_cache_lock is introduced to protect the cache
> enqueue and dequeue.
> (a) When tdp_mmu_split_huge_pages_root() is invoked under write mmu_lock:
> - Since cache dequeue is already under write mmu_lock in
> tdp_mmu_split_huge_page()-->tdx_sept_split_private_spte(), acquiring/
> releasing another spin lock doesn't matter.
> - Though the cache enqueue in topup_external_split_cache() is not protected
> by mmu_lock, protecting enqueue with a spinlock should not reduce
> concurrency.
Even with the spin lock protecting the cache top-up and consumption
operations, is it possible that one split context consumes the top-up
performed by another split context, causing that other context's
subsequent consumptions to fail?
>
> (b) When tdp_mmu_split_huge_pages_root() is invoked under read mmu_lock:
> Introducing a new spinlock may briefly reduce concurrency (a necessary
> cost of protecting the cache).
> However, there's no known (future) use case for multiple threads invoking
> tdp_mmu_split_huge_pages_root() on mirror root under shared mmu_lock.
>
> For future splitting under shared mmu_lock in the fault path, we'll use
> the per-vCPU tdx_prealloc instead of the per-VM cache. TDX can leverage
> kvm_get_running_vcpu() to differentiate between the two caches.
>
>