lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aYMMHVvwDjZ7Lz9l@yzhao56-desk.sh.intel.com>
Date: Wed, 4 Feb 2026 17:06:37 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: Sean Christopherson <seanjc@...gle.com>
CC: Thomas Gleixner <tglx@...nel.org>, Ingo Molnar <mingo@...hat.com>,
	Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
	<x86@...nel.org>, Kiryl Shutsemau <kas@...nel.org>, Paolo Bonzini
	<pbonzini@...hat.com>, <linux-kernel@...r.kernel.org>,
	<linux-coco@...ts.linux.dev>, <kvm@...r.kernel.org>, Kai Huang
	<kai.huang@...el.com>, Rick Edgecombe <rick.p.edgecombe@...el.com>, "Vishal
 Annapurve" <vannapurve@...gle.com>, Ackerley Tng <ackerleytng@...gle.com>,
	Sagi Shahar <sagis@...gle.com>, Binbin Wu <binbin.wu@...ux.intel.com>,
	Xiaoyao Li <xiaoyao.li@...el.com>, Isaku Yamahata <isaku.yamahata@...el.com>
Subject: Re: [RFC PATCH v5 08/45] KVM: x86/mmu: Propagate mirror SPTE removal
 to S-EPT in handle_changed_spte()

On Wed, Jan 28, 2026 at 05:14:40PM -0800, Sean Christopherson wrote:
> Invoke .remove_external_spte() in handle_changed_spte() as appropriate
> instead of relying on callers to do the right thing.  Relying on callers
> to invoke .remove_external_spte() is confusing and brittle, e.g. subtly
> relies tdp_mmu_set_spte_atomic() never removing SPTEs, and removing an
> S-EPT entry in tdp_mmu_set_spte() is bizarre (yeah, the VM is bugged so
> it doesn't matter in practice, but it's still weird).
> 
> Implementing rules-based logic in a common chokepoint will also make it
> easier to reason about the correctness of splitting hugepages when support
> for S-EPT hugepages comes along.
> 
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
>  arch/x86/kvm/mmu/tdp_mmu.c | 43 +++++++++++++-------------------------
>  1 file changed, 14 insertions(+), 29 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 8743cd020d12..27ac520f2a89 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -359,25 +359,6 @@ static void tdp_mmu_unlink_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
>  	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
>  }
>  
> -static void remove_external_spte(struct kvm *kvm, gfn_t gfn, u64 old_spte,
> -				 int level)
> -{
> -	/*
> -	 * External (TDX) SPTEs are limited to PG_LEVEL_4K, and external
> -	 * PTs are removed in a special order, involving free_external_spt().
> -	 * But remove_external_spte() will be called on non-leaf PTEs via
> -	 * __tdp_mmu_zap_root(), so avoid the error the former would return
> -	 * in this case.
> -	 */
> -	if (!is_last_spte(old_spte, level))
> -		return;
> -
> -	/* Zapping leaf spte is allowed only when write lock is held. */
> -	lockdep_assert_held_write(&kvm->mmu_lock);
> -
> -	kvm_x86_call(remove_external_spte)(kvm, gfn, level, old_spte);
> -}
> -
>  /**
>   * handle_removed_pt() - handle a page table removed from the TDP structure
>   *
> @@ -473,11 +454,6 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
>  		}
>  		handle_changed_spte(kvm, kvm_mmu_page_as_id(sp), sptep, gfn,
>  				    old_spte, FROZEN_SPTE, level, shared);
> -
> -		if (is_mirror_sp(sp)) {
> -			KVM_BUG_ON(shared, kvm);
> -			remove_external_spte(kvm, gfn, old_spte, level);
> -		}
>  	}
>  
>  	if (is_mirror_sp(sp) &&
> @@ -590,10 +566,21 @@ static void handle_changed_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
>  	 * the paging structure.  Note the WARN on the PFN changing without the
>  	 * SPTE being converted to a hugepage (leaf) or being zapped.  Shadow
>  	 * pages are kernel allocations and should never be migrated.
> +	 *
> +	 * When removing leaf entries from a mirror, immediately propagate the
> +	 * changes to the external page tables.  Note, non-leaf mirror entries
> +	 * are handled by handle_removed_pt(), as TDX requires that all leaf
> +	 * entries are removed before the owning page table.  Note #2, writes
> +	 * to make mirror PTEs shadow-present are propagated to external page
> +	 * tables by __tdp_mmu_set_spte_atomic(), as KVM needs to ensure the
> +	 * external page table was successfully updated before marking the
> +	 * mirror SPTE present.
>  	 */
>  	if (was_present && !was_leaf &&
>  	    (is_leaf || !is_present || WARN_ON_ONCE(pfn_changed)))
>  		handle_removed_pt(kvm, spte_to_child_pt(old_spte, level), shared);
> +	else if (was_leaf && is_mirror_sptep(sptep) && !is_leaf)
Should we check !is_present instead of !is_leaf?
e.g. a transition from a present leaf entry to a present non-leaf entry could
also trigger this if case.

Besides, need "KVM_BUG_ON(shared, kvm)" in this case.
> +		kvm_x86_call(remove_external_spte)(kvm, gfn, level, old_spte);
>  }
>  
>  static inline int __must_check __tdp_mmu_set_spte_atomic(struct kvm *kvm,
> @@ -725,12 +712,10 @@ static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
>  
>  	/*
>  	 * Users that do non-atomic setting of PTEs don't operate on mirror
> -	 * roots, so don't handle it and bug the VM if it's seen.
> +	 * roots.  Bug the VM as this path doesn't propagate such writes to the
> +	 * external page tables.
>  	 */
> -	if (is_mirror_sptep(sptep)) {
> -		KVM_BUG_ON(is_shadow_present_pte(new_spte), kvm);
> -		remove_external_spte(kvm, gfn, old_spte, level);
> -	}
> +	KVM_BUG_ON(is_mirror_sptep(sptep) && is_shadow_present_pte(new_spte), kvm);
>  
>  	return old_spte;
>  }
> -- 
> 2.53.0.rc1.217.geba53bf80e-goog
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ