Message-ID: <aYrsviTu/ET8N7DH@yzhao56-desk.sh.intel.com>
Date: Tue, 10 Feb 2026 16:30:54 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: Sean Christopherson <seanjc@...gle.com>
CC: Thomas Gleixner <tglx@...nel.org>, Ingo Molnar <mingo@...hat.com>,
	Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
	<x86@...nel.org>, Kiryl Shutsemau <kas@...nel.org>, Paolo Bonzini
	<pbonzini@...hat.com>, <linux-kernel@...r.kernel.org>,
	<linux-coco@...ts.linux.dev>, <kvm@...r.kernel.org>, Kai Huang
	<kai.huang@...el.com>, Rick Edgecombe <rick.p.edgecombe@...el.com>, "Vishal
 Annapurve" <vannapurve@...gle.com>, Ackerley Tng <ackerleytng@...gle.com>,
	Sagi Shahar <sagis@...gle.com>, Binbin Wu <binbin.wu@...ux.intel.com>,
	Xiaoyao Li <xiaoyao.li@...el.com>, Isaku Yamahata <isaku.yamahata@...el.com>
Subject: Re: [RFC PATCH v5 20/45] KVM: x86/mmu: Allocate/free S-EPT pages
 using tdx_{alloc,free}_control_page()

On Mon, Feb 09, 2026 at 03:20:38PM -0800, Sean Christopherson wrote:
> On Mon, Feb 09, 2026, Yan Zhao wrote:
> > On Fri, Feb 06, 2026 at 07:01:14AM -0800, Sean Christopherson wrote:
> > > @@ -2348,7 +2348,7 @@ void __tdx_pamt_put(u64 pfn)
> > >         if (!atomic_dec_and_test(pamt_refcount))
> > >                 return;
> > >  
> > > -       scoped_guard(spinlock, &pamt_lock) {
> > > +       scoped_guard(raw_spinlock_irqsave, &pamt_lock) {
> > >                 /* Lost race with tdx_pamt_get(). */
> > >                 if (atomic_read(pamt_refcount))
> > >                         return;
> > 
> > This option can get rid of the warning.
> > 
> > However, given that pamt_lock is a global lock, which may be acquired even
> > in softirq context, I'm not sure this IRQ-disabled version is good.
> 
> FWIW, the SEAMCALL itself disables IRQs (and everything else), so it's not _that_
> big of a change.  But yeah, waiting on the spinlock with IRQs disabled isn't
> exactly ideal.
Right. Though the SEAMCALL itself disables IRQs (for no more than 18us per my
measurement), the time spent waiting to acquire the spinlock with IRQs disabled
scales with the number of contending threads, e.g. with 4 threads contending
for the spinlock, the unluckiest thread may wait with IRQs disabled for
3 x 18us = 54us in the worst case.
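
For the record, scoped_guard(raw_spinlock_irqsave, &pamt_lock) boils down to
roughly the below (a sketch of the locking pattern, not the actual patch):

	unsigned long flags;

	/*
	 * IRQs are disabled before the spin begins, so the wait for the
	 * lock is itself part of the IRQ-off window, not just the
	 * critical section.
	 */
	raw_spin_lock_irqsave(&pamt_lock, flags);
	/* Lost race with tdx_pamt_get(). */
	if (!atomic_read(pamt_refcount)) {
		/* ~18us tdh_phymem_pamt_remove() SEAMCALL here */
	}
	raw_spin_unlock_irqrestore(&pamt_lock, flags);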

> > For your reference, I measured some test data by concurrently launching and
> > destroying 4 TDs for 3 rounds:
> > 
> >                                t0 ---------------------
> > scoped_guard(spinlock, &pamt_lock) {       |->T1=t1-t0 |
> >                                t1 ----------           |
> >  ...                                                   |
> >                                t2 ----------           |->T3=t4-t0
> >  tdh_phymem_pamt_add/remove()              |->T2=t3-t2 |
> >                                t3 ----------           |
> >  ...                                                   |
> >                                t4 ---------------------
> > }
> > 
> > (1) for __tdx_pamt_get()
> > 
> >        avg us   min us   max us
> > ------|---------------------------
> >   T1  |   4       0       69
> >   T2  |   4       2       18
> >   T3  |  10       3       83
> > 
> > 
> > (2) for __tdx_pamt_put()
> > 
> >        avg us   min us   max us
> > ------|---------------------------
> >   T1  |   0        0       5
> >   T2  |   2        1      11
> >   T3  |   3        2      15
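
For reference, the instrumentation behind those timings is roughly like below
(a sketch matching the diagram above; ktime_get_ns() and the local names are
illustrative, not the exact measurement code):

	u64 t0, t1, t2, t3, t4;

	t0 = ktime_get_ns();
	scoped_guard(spinlock, &pamt_lock) {
		t1 = ktime_get_ns();	/* T1 = t1 - t0: lock wait */
		/* ... */
		t2 = ktime_get_ns();
		/* tdh_phymem_pamt_add()/remove() */
		t3 = ktime_get_ns();	/* T2 = t3 - t2: SEAMCALL */
		/* ... */
		t4 = ktime_get_ns();	/* T3 = t4 - t0: total */
	}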
> > 
> >  
> > > Option #2 would be to immediately free the page in tdx_sept_reclaim_private_sp(),
> > > so that pages that freed via handle_removed_pt() don't defer freeing the S-EPT
> > > page table (which, IIUC, is safe since the TDX-Module forces TLB flushes and exits).
> > > 
> > > I really, really don't like this option (if it even works).
> > I don't like its asymmetry with tdx_sept_link_private_spt().
> > 
> > However, do you think it would be good to have the PAMT pages of the sept pages
> > allocated from (*topup_private_mapping_cache) [1]?
> 
> Hrm, dunno about "good", but it's definitely not terrible.  To get the cache
> management right, it means adding yet another use of kvm_get_running_vcpu(), which
> I really dislike.
> 
> On the other hand, if we combine that with TDX freeing in-use S-EPT page tables,
> unless I'm overly simplifying things, it would avoid having to extend
> kvm_mmu_memory_cache with the page_{get,free}() hook, and would then eliminate
> two kvm_x86_ops hooks, because the alloc/free of _unused_ S-EPT page tables is
> no different than regular page tables.
> 
> As a bonus, we could keep the topup_external_cache() name and just clarify that
> the parameter specifies the number of page table pages, i.e. account for the +1
> for the mapping page in TDX code.
> 
> All in all, I'm kinda leaning in this direction, because as much as I dislike
> kvm_get_running_vcpu(), it does minimize the number of kvm_x86_ops hooks.
> 
> Something like this?  Also pushed to 
> 
>   https://github.com/sean-jc/linux.git x86/tdx_huge_sept_alt
> 
It lacks the following change in tdx_sept_split_private_spte().

@@ -1836,46 +1841,70 @@ static int tdx_sept_split_private_spte(struct kvm *kvm, gfn_t gfn, u64 old_spte,
        if (!pamt_cache)
                return -EIO;

+       r = tdx_pamt_get(page_to_pfn(external_spt), PG_LEVEL_4K, pamt_cache);
+       if (r)
+               return r;
+
        err = tdh_do_no_vcpus(tdh_mem_range_block, kvm, &kvm_tdx->td, gpa,
                              level, &entry, &level_state);
-       if (TDX_BUG_ON_2(err, TDH_MEM_RANGE_BLOCK, entry, level_state, kvm))
-               return -EIO;
+       if (TDX_BUG_ON_2(err, TDH_MEM_RANGE_BLOCK, entry, level_state, kvm)) {
+               r = -EIO;
+               goto err;
+       }

        tdx_track(kvm);

        err = tdh_do_no_vcpus(tdh_mem_page_demote, kvm, &kvm_tdx->td, gpa,
                              level, spte_to_pfn(old_spte), external_spt,
                              pamt_cache, &entry, &level_state);
-       if (TDX_BUG_ON_2(err, TDH_MEM_PAGE_DEMOTE, entry, level_state, kvm))
-               return -EIO;
+       if (TDX_BUG_ON_2(err, TDH_MEM_PAGE_DEMOTE, entry, level_state, kvm)) {
+               r = -EIO;
+               goto err;
+       }

        return 0;
+err:
+       tdx_pamt_put(page_to_pfn(external_spt), PG_LEVEL_4K);
+       return r;
 }


Otherwise, LGTM except for the nits below.

> ---
>  arch/x86/include/asm/kvm-x86-ops.h |  6 +--
>  arch/x86/include/asm/kvm_host.h    | 15 ++------
>  arch/x86/kvm/mmu/mmu.c             |  3 --
>  arch/x86/kvm/mmu/tdp_mmu.c         | 23 +++++++-----
>  arch/x86/kvm/vmx/tdx.c             | 60 ++++++++++++++++++++----------
>  5 files changed, 61 insertions(+), 46 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index 6083fb07cd3b..4b865617a421 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -94,11 +94,9 @@ KVM_X86_OP_OPTIONAL_RET0(set_tss_addr)
>  KVM_X86_OP_OPTIONAL_RET0(set_identity_map_addr)
>  KVM_X86_OP_OPTIONAL_RET0(get_mt_mask)
>  KVM_X86_OP(load_mmu_pgd)
> -KVM_X86_OP_OPTIONAL(alloc_external_sp)
> -KVM_X86_OP_OPTIONAL(free_external_sp)
> -KVM_X86_OP_OPTIONAL_RET0(set_external_spte)
> -KVM_X86_OP_OPTIONAL(reclaim_external_sp)
> +KVM_X86_OP_OPTIONAL(reclaim_external_spt)
>  KVM_X86_OP_OPTIONAL_RET0(topup_external_cache)
> +KVM_X86_OP_OPTIONAL_RET0(set_external_spte)
>  KVM_X86_OP(has_wbinvd_exit)
>  KVM_X86_OP(get_l2_tsc_offset)
>  KVM_X86_OP(get_l2_tsc_multiplier)
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index cd3e7dc6ab9b..d3c31eaf18b1 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1850,19 +1850,12 @@ struct kvm_x86_ops {
>  	void (*load_mmu_pgd)(struct kvm_vcpu *vcpu, hpa_t root_hpa,
>  			     int root_level);
>  
> -	/*
> -	 * Callbacks to allocate and free external page tables, a.k.a. S-EPT,
> -	 * and to propagate changes in mirror page tables to the external page
> -	 * tables.
> -	 */
> -	unsigned long (*alloc_external_sp)(gfp_t gfp);
> -	void (*free_external_sp)(unsigned long addr);
> +	void (*reclaim_external_spt)(struct kvm *kvm, gfn_t gfn,
> +				     struct kvm_mmu_page *sp);
> +	int (*topup_external_cache)(struct kvm *kvm, struct kvm_vcpu *vcpu,
> +				    int min_nr_spts);
>  	int (*set_external_spte)(struct kvm *kvm, gfn_t gfn, u64 old_spte,
>  				 u64 new_spte, enum pg_level level);
> -	void (*reclaim_external_sp)(struct kvm *kvm, gfn_t gfn,
> -				    struct kvm_mmu_page *sp);
> -	int (*topup_external_cache)(struct kvm *kvm, struct kvm_vcpu *vcpu, int min);
> -
>  
>  	bool (*has_wbinvd_exit)(void);
>  
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 62bf6bec2df2..f7cf456d9404 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -6714,9 +6714,6 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
>  	if (!vcpu->arch.mmu_shadow_page_cache.init_value)
>  		vcpu->arch.mmu_shadow_page_cache.gfp_zero = __GFP_ZERO;
>  
> -	vcpu->arch.mmu_external_spt_cache.page_get = kvm_x86_ops.alloc_external_sp;
> -	vcpu->arch.mmu_external_spt_cache.page_free = kvm_x86_ops.free_external_sp;
> -
>  	vcpu->arch.mmu = &vcpu->arch.root_mmu;
>  	vcpu->arch.walk_mmu = &vcpu->arch.root_mmu;
>  
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index fef856323821..732548a678d8 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -53,14 +53,18 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
>  	rcu_barrier();
>  }
>  
> -static void tdp_mmu_free_sp(struct kvm_mmu_page *sp)
> +static void __tdp_mmu_free_sp(struct kvm_mmu_page *sp)
>  {
> -	if (sp->external_spt)
> -		kvm_x86_call(free_external_sp)((unsigned long)sp->external_spt);
>  	free_page((unsigned long)sp->spt);
>  	kmem_cache_free(mmu_page_header_cache, sp);
>  }
>  
> +static void tdp_mmu_free_unused_sp(struct kvm_mmu_page *sp)
> +{
> +	free_page((unsigned long)sp->external_spt);
> +	__tdp_mmu_free_sp(sp);
> +}
> +
>  /*
>   * This is called through call_rcu in order to free TDP page table memory
>   * safely with respect to other kernel threads that may be operating on
> @@ -74,7 +78,8 @@ static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head)
>  	struct kvm_mmu_page *sp = container_of(head, struct kvm_mmu_page,
>  					       rcu_head);
>  
> -	tdp_mmu_free_sp(sp);
> +	WARN_ON_ONCE(sp->external_spt);
> +	__tdp_mmu_free_sp(sp);
>  }
>  
>  void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root)
> @@ -458,7 +463,7 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
>  	}
>  
>  	if (is_mirror_sp(sp))
> -		kvm_x86_call(reclaim_external_sp)(kvm, base_gfn, sp);
> +		kvm_x86_call(reclaim_external_spt)(kvm, base_gfn, sp);
>  
>  	call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback);
>  }
> @@ -1266,7 +1271,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
>  		 * failed, e.g. because a different task modified the SPTE.
>  		 */
>  		if (r) {
> -			tdp_mmu_free_sp(sp);
> +			tdp_mmu_free_unused_sp(sp);
>  			goto retry;
>  		}
>  
> @@ -1461,7 +1466,7 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp_for_split(struct kvm *kvm,
>  		goto err_spt;
>  
>  	if (is_mirror_sp) {
> -		sp->external_spt = (void *)kvm_x86_call(alloc_external_sp)(GFP_KERNEL_ACCOUNT);
> +		sp->external_spt = (void *)__get_free_page(GFP_KERNEL_ACCOUNT);
>  		if (!sp->external_spt)
>  			goto err_external_spt;
>  
> @@ -1472,7 +1477,7 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp_for_split(struct kvm *kvm,
>  	return sp;
>  
>  err_external_split:
> -	kvm_x86_call(free_external_sp)((unsigned long)sp->external_spt);
> +	free_page((unsigned long)sp->external_spt);
>  err_external_spt:
>  	free_page((unsigned long)sp->spt);
>  err_spt:
> @@ -1594,7 +1599,7 @@ static int tdp_mmu_split_huge_pages_root(struct kvm *kvm,
>  	 * installs its own sp in place of the last sp we tried to split.
>  	 */
>  	if (sp)
> -		tdp_mmu_free_sp(sp);
> +		tdp_mmu_free_unused_sp(sp);
>  
>  	return 0;
>  }
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index ae7b9beb3249..b0fc17baa1fc 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -1701,7 +1701,7 @@ static struct tdx_pamt_cache *tdx_get_pamt_cache(struct kvm *kvm,
>  }
>  
>  static int tdx_topup_external_pamt_cache(struct kvm *kvm,
> -					 struct kvm_vcpu *vcpu, int min)
> +					 struct kvm_vcpu *vcpu, int min_nr_spts)
>  {
>  	struct tdx_pamt_cache *pamt_cache;
>  
> @@ -1712,7 +1712,11 @@ static int tdx_topup_external_pamt_cache(struct kvm *kvm,
>  	if (!pamt_cache)
>  		return -EIO;
>  
> -	return tdx_topup_pamt_cache(pamt_cache, min);
> +	/*
> +	 * Each S-EPT page table requires a DPAMT pair, plus one more for the
> +	 * memory being mapped into the guest.
> +	 */
> +	return tdx_topup_pamt_cache(pamt_cache, min_nr_spts + 1);
Nit:
The S-EPT root page is a control page and has no corresponding sp->external_spt.

So, do you think it would be good to check the root level?

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index ae8b8438ae99..fff05052de27 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1643,16 +1643,23 @@ static struct tdx_pamt_cache *tdx_get_pamt_cache(struct kvm *kvm,
 static int tdx_topup_external_pamt_cache(struct kvm *kvm,
                                         struct kvm_vcpu *vcpu, int min_nr_spts)
 {
+       int root_level = (kvm_gfn_direct_bits(kvm) == TDX_SHARED_BIT_PWL_5) ? 5 : 4;
        struct tdx_pamt_cache *pamt_cache;

        if (!tdx_supports_dynamic_pamt(tdx_sysinfo))
                return 0;

        pamt_cache = tdx_get_pamt_cache(kvm, vcpu);
        if (!pamt_cache)
                return -EIO;

+       /*
+        * The S-EPT root page is one of the tdcs_pages, whose PAMT pages
+        * were installed in __tdx_td_init().
+        */
+       if (min_nr_spts == root_level)
+               min_nr_spts--;
+
        /*
         * Each S-EPT page table requires a DPAMT pair, plus one more for the
         * memory being mapped into the guest.


>  }
>  
>  static int tdx_mem_page_add(struct kvm *kvm, gfn_t gfn, enum pg_level level,
> @@ -1911,23 +1915,41 @@ static int tdx_sept_split_private_spte(struct kvm *kvm, gfn_t gfn, u64 old_spte,
>  static int tdx_sept_link_private_spt(struct kvm *kvm, gfn_t gfn, u64 new_spte,
>  				     enum pg_level level)
>  {
> +	struct tdx_pamt_cache *pamt_cache;
>  	gpa_t gpa = gfn_to_gpa(gfn);
>  	u64 err, entry, level_state;
>  	struct page *external_spt;
> +	int r;
>  
>  	external_spt = tdx_spte_to_external_spt(kvm, gfn, new_spte, level);
>  	if (!external_spt)
>  		return -EIO;
>  
> +	pamt_cache = tdx_get_pamt_cache(kvm, kvm_get_running_vcpu());
> +	if (!pamt_cache)
> +		return -EIO;
> +
> +	r = tdx_pamt_get(page_to_pfn(external_spt), PG_LEVEL_4K, pamt_cache);
> +	if (r)
> +		return r;
> +
>  	err = tdh_mem_sept_add(&to_kvm_tdx(kvm)->td, gpa, level, external_spt,
>  			       &entry, &level_state);
> -	if (unlikely(IS_TDX_OPERAND_BUSY(err)))
> -		return -EBUSY;
> +	if (unlikely(IS_TDX_OPERAND_BUSY(err))) {
> +		r = -EBUSY;
> +		goto err;
> +	}
>  
> -	if (TDX_BUG_ON_2(err, TDH_MEM_SEPT_ADD, entry, level_state, kvm))
> -		return -EIO;
> +	if (TDX_BUG_ON_2(err, TDH_MEM_SEPT_ADD, entry, level_state, kvm)) {
> +		r = -EIO;
> +		goto err;
> +	}
>  
>  	return 0;
> +
> +err:
> +	tdx_pamt_put(page_to_pfn(external_spt), PG_LEVEL_4K);
> +	return r;
>  }
>  
>  static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
> @@ -1995,8 +2017,8 @@ static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn, u64 old_spte,
>  	return tdx_sept_map_leaf_spte(kvm, gfn, new_spte, level);
>  }
>  
> -static void tdx_sept_reclaim_private_sp(struct kvm *kvm, gfn_t gfn,
> -					struct kvm_mmu_page *sp)
> +static void tdx_sept_reclaim_private_spt(struct kvm *kvm, gfn_t gfn,
> +					 struct kvm_mmu_page *sp)
>  {
>  	/*
>  	 * KVM doesn't (yet) zap page table pages in mirror page table while
> @@ -2014,7 +2036,16 @@ static void tdx_sept_reclaim_private_sp(struct kvm *kvm, gfn_t gfn,
>  	 */
>  	if (KVM_BUG_ON(is_hkid_assigned(to_kvm_tdx(kvm)), kvm) ||
>  	    tdx_reclaim_page(virt_to_page(sp->external_spt)))
> -		sp->external_spt = NULL;
> +		goto out;
> +
> +	/*
> +	 * Immediately free the S-EPT page as the TDX subsystem doesn't support
> +	 * freeing pages from RCU callbacks, and more importantly because
> +	 * TDH.PHYMEM.PAGE.RECLAIM ensures there are no outstanding readers.
> +	 */
> +	tdx_free_control_page((unsigned long)sp->external_spt);
This creates another asymmetry: there's no corresponding
tdx_alloc_control_page() call for sp->external_spt.

Calling tdx_free_control_page() here could be confusing because:
- tdx_sept_reclaim_private_spt() is called only for non-root sps, whose
  sp->external_spt is not allocated via tdx_alloc_control_page().
- The S-EPT root page is allocated via __tdx_alloc_control_page() by
  __tdx_td_init(), but has no corresponding sp->external_spt.

So, could we just invoke
"__tdx_pamt_put(page_to_pfn(virt_to_page(sp->external_spt)))" in
tdx_sept_reclaim_private_spt()?

After tdx_sept_reclaim_private_spt() returns, the sp is no longer used by the
external page table, so the TDP MMU can invoke tdp_mmu_free_sp() without
needing to differentiate between used and unused sps.

Something like below?

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 732548a678d8..d621e94d73c2 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -53,18 +53,15 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
        rcu_barrier();
 }

-static void __tdp_mmu_free_sp(struct kvm_mmu_page *sp)
+static void tdp_mmu_free_sp(struct kvm_mmu_page *sp)
 {
+       free_page((unsigned long)sp->external_spt);
        free_page((unsigned long)sp->spt);
        kmem_cache_free(mmu_page_header_cache, sp);
 }

-static void tdp_mmu_free_unused_sp(struct kvm_mmu_page *sp)
-{
-       free_page((unsigned long)sp->external_spt);
-       __tdp_mmu_free_sp(sp);
-}
-
 /*
  * This is called through call_rcu in order to free TDP page table memory
  * safely with respect to other kernel threads that may be operating on
@@ -78,8 +75,7 @@ static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head)
        struct kvm_mmu_page *sp = container_of(head, struct kvm_mmu_page,
                                               rcu_head);

-       WARN_ON_ONCE(sp->external_spt);
-       __tdp_mmu_free_sp(sp);
+       tdp_mmu_free_sp(sp);
 }

 void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root)
@@ -1271,7 +1267,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
                 * failed, e.g. because a different task modified the SPTE.
                 */
                if (r) {
-                       tdp_mmu_free_unused_sp(sp);
+                       tdp_mmu_free_sp(sp);
                        goto retry;
                }

@@ -1599,7 +1595,7 @@ static int tdp_mmu_split_huge_pages_root(struct kvm *kvm,
         * installs its own sp in place of the last sp we tried to split.
         */
        if (sp)
-               tdp_mmu_free_unused_sp(sp);
+               tdp_mmu_free_sp(sp);

        return 0;
 }
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index b0fc17baa1fc..fbaf43b8cd46 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -2035,17 +2035,12 @@ static void tdx_sept_reclaim_private_spt(struct kvm *kvm, gfn_t gfn,
         * removal of the still-used PAMT entry.
         */
        if (KVM_BUG_ON(is_hkid_assigned(to_kvm_tdx(kvm)), kvm) ||
-           tdx_reclaim_page(virt_to_page(sp->external_spt)))
-               goto out;
+           tdx_reclaim_page(virt_to_page(sp->external_spt))) {
+               sp->external_spt = NULL;
+               return;
+       }

-       /*
-        * Immediately free the S-EPT page as the TDX subsystem doesn't support
-        * freeing pages from RCU callbacks, and more importantly because
-        * TDH.PHYMEM.PAGE.RECLAIM ensures there are no outstanding readers.
-        */
-       tdx_free_control_page((unsigned long)sp->external_spt);
-out:
-       sp->external_spt = NULL;
+       __tdx_pamt_put(page_to_pfn(virt_to_page(sp->external_spt)));
 }

 void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode, 
