Message-ID: <aYXAdJV8rvWn4EQf@yzhao56-desk.sh.intel.com>
Date: Fri, 6 Feb 2026 18:20:36 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: Sean Christopherson <seanjc@...gle.com>
CC: Thomas Gleixner <tglx@...nel.org>, Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
<x86@...nel.org>, Kiryl Shutsemau <kas@...nel.org>, Paolo Bonzini
<pbonzini@...hat.com>, <linux-kernel@...r.kernel.org>,
<linux-coco@...ts.linux.dev>, <kvm@...r.kernel.org>, Kai Huang
<kai.huang@...el.com>, Rick Edgecombe <rick.p.edgecombe@...el.com>, "Vishal
Annapurve" <vannapurve@...gle.com>, Ackerley Tng <ackerleytng@...gle.com>,
Sagi Shahar <sagis@...gle.com>, Binbin Wu <binbin.wu@...ux.intel.com>,
Xiaoyao Li <xiaoyao.li@...el.com>, Isaku Yamahata <isaku.yamahata@...el.com>
Subject: Re: [RFC PATCH v5 22/45] KVM: TDX: Get/put PAMT pages when
(un)mapping private memory
On Wed, Jan 28, 2026 at 05:14:54PM -0800, Sean Christopherson wrote:
> From: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
>
> Add Dynamic PAMT support to KVM's S-EPT MMU by "getting" a PAMT page when
> adding guest memory (PAGE.ADD or PAGE.AUG), and "putting" the page when
> removing guest memory (PAGE.REMOVE).
>
> To access the per-vCPU PAMT caches without plumbing @vcpu throughout the
> TDP MMU, begrudgingly use kvm_get_running_vcpu() to get the vCPU, and bug
> the VM if KVM attempts to set an S-EPT entry without an active vCPU. KVM only
> supports creating _new_ mappings in page (pre)fault paths, all of which
> require an active vCPU.
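For reference, the pattern described above would look roughly like the sketch
below (the per-vCPU cache field name is a guess on my part, not taken from
this patch):

static struct kvm_mmu_memory_cache *tdx_get_running_pamt_cache(struct kvm *kvm)
{
	struct kvm_vcpu *vcpu = kvm_get_running_vcpu();

	/*
	 * New S-EPT mappings are only created from page (pre)fault paths,
	 * which always run with an active vCPU; anything else is a KVM bug.
	 */
	if (KVM_BUG_ON(!vcpu, kvm))
		return NULL;

	/* "pamt_page_cache" is a hypothetical field name for illustration. */
	return &to_tdx(vcpu)->pamt_page_cache;
}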
>
> The PAMT memory holds metadata for TDX-protected memory. With Dynamic
> PAMT, PAMT_4K is allocated on demand. The kernel supplies the TDX module
> with a few pages that cover 2M of host physical memory.
>
> PAMT memory can be reclaimed when its last user is gone. This can happen
> in a few code paths:
>
> - On TDH.PHYMEM.PAGE.RECLAIM in tdx_reclaim_td_control_pages() and
> tdx_reclaim_page().
>
> - On TDH.MEM.PAGE.REMOVE in tdx_sept_drop_private_spte().
>
> - In tdx_sept_zap_private_spte() for pages that were queued to be added
>   with TDH.MEM.PAGE.ADD, but the ADD never happened due to an error.
>
> - In tdx_sept_free_private_spt() for S-EPT pages.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> [Minor log tweak]
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@...el.com>
> Co-developed-by: Sean Christopherson <seanjc@...gle.com>
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
> arch/x86/include/asm/kvm-x86-ops.h | 1 +
> arch/x86/include/asm/kvm_host.h | 1 +
> arch/x86/kvm/mmu/mmu.c | 4 +++
> arch/x86/kvm/vmx/tdx.c | 44 ++++++++++++++++++++++++++----
> arch/x86/kvm/vmx/tdx.h | 2 ++
> 5 files changed, 47 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index 17dddada69fc..394dc29483a7 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -99,6 +99,7 @@ KVM_X86_OP_OPTIONAL(free_external_sp)
> KVM_X86_OP_OPTIONAL_RET0(set_external_spte)
> KVM_X86_OP_OPTIONAL(remove_external_spte)
> KVM_X86_OP_OPTIONAL(reclaim_external_sp)
> +KVM_X86_OP_OPTIONAL_RET0(topup_external_cache)
> KVM_X86_OP(has_wbinvd_exit)
> KVM_X86_OP(get_l2_tsc_offset)
> KVM_X86_OP(get_l2_tsc_multiplier)
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 6e84dbc89e79..a6e4ab76b1b2 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1863,6 +1863,7 @@ struct kvm_x86_ops {
> struct kvm_mmu_page *sp);
> void (*remove_external_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
> u64 mirror_spte);
> + int (*topup_external_cache)(struct kvm_vcpu *vcpu, int min);
>
>
> bool (*has_wbinvd_exit)(void);
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 9b5a6861e2a4..4ecbf216d96f 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -605,6 +605,10 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect)
> PT64_ROOT_MAX_LEVEL);
> if (r)
> return r;
> +
> + r = kvm_x86_call(topup_external_cache)(vcpu, PT64_ROOT_MAX_LEVEL);
If this external cache is only used to allocate PAMT pages for guest pages,
shouldn't the min count here be 1 instead of PT64_ROOT_MAX_LEVEL?
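
FWIW, I'd expect tdx_topup_external_pamt_cache() (wired up at the end of the
patch) to be a thin wrapper along the lines of the sketch below; the cache
field name and the per-2M page-count helper are my guesses, not taken from
the patch:

static int tdx_topup_external_pamt_cache(struct kvm_vcpu *vcpu, int min)
{
	struct vcpu_tdx *tdx = to_tdx(vcpu);

	/*
	 * Each new leaf mapping may need PAMT_4K backing pages for its 2M
	 * region, so scale the requested minimum by the number of pages the
	 * TDX module needs per 2M range (helper name is hypothetical).
	 */
	return kvm_mmu_topup_memory_cache(&tdx->pamt_page_cache,
					  min * tdx_dpamt_entry_pages());
}

Which is also where my question about @min above comes from.
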
> + if (r)
> + return r;
> }
> r = kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_shadow_page_cache,
> PT64_ROOT_MAX_LEVEL);
...
> void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
> @@ -3614,5 +3641,12 @@ void __init tdx_hardware_setup(void)
> vt_x86_ops.set_external_spte = tdx_sept_set_private_spte;
> vt_x86_ops.reclaim_external_sp = tdx_sept_reclaim_private_sp;
> vt_x86_ops.remove_external_spte = tdx_sept_remove_private_spte;
> +
> + /*
> + * FIXME: Wire up the PAMT hook iff DPAMT is supported, once VMXON is
> + * moved out of KVM and tdx_bringup() is folded into here.
> + */
> + vt_x86_ops.topup_external_cache = tdx_topup_external_pamt_cache;
> +
> vt_x86_ops.protected_apic_has_interrupt = tdx_protected_apic_has_interrupt;
> }