Message-ID: <aYL3izez+eZ34G/3@yzhao56-desk.sh.intel.com>
Date: Wed, 4 Feb 2026 15:38:51 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: Sean Christopherson <seanjc@...gle.com>
CC: Thomas Gleixner <tglx@...nel.org>, Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
<x86@...nel.org>, Kiryl Shutsemau <kas@...nel.org>, Paolo Bonzini
<pbonzini@...hat.com>, <linux-kernel@...r.kernel.org>,
<linux-coco@...ts.linux.dev>, <kvm@...r.kernel.org>, Kai Huang
<kai.huang@...el.com>, Rick Edgecombe <rick.p.edgecombe@...el.com>, "Vishal
Annapurve" <vannapurve@...gle.com>, Ackerley Tng <ackerleytng@...gle.com>,
Sagi Shahar <sagis@...gle.com>, Binbin Wu <binbin.wu@...ux.intel.com>,
Xiaoyao Li <xiaoyao.li@...el.com>, Isaku Yamahata <isaku.yamahata@...el.com>
Subject: Re: [RFC PATCH v5 06/45] KVM: x86/mmu: Fold
set_external_spte_present() into its sole caller
On Wed, Jan 28, 2026 at 05:14:38PM -0800, Sean Christopherson wrote:
> Fold set_external_spte_present() into __tdp_mmu_set_spte_atomic() in
> anticipation of supporting hugepage splitting, at which point other paths
> will also set shadow-present external SPTEs.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
> arch/x86/kvm/mmu/tdp_mmu.c | 82 +++++++++++++++++---------------------
> 1 file changed, 36 insertions(+), 46 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 56ad056e6042..6fb48b217f5b 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -495,33 +495,6 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
> call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback);
> }
>
> -static int __must_check set_external_spte_present(struct kvm *kvm, tdp_ptep_t sptep,
> - gfn_t gfn, u64 *old_spte,
> - u64 new_spte, int level)
> -{
> - int ret;
> -
> - lockdep_assert_held(&kvm->mmu_lock);
> -
> - if (KVM_BUG_ON(is_shadow_present_pte(*old_spte), kvm))
> - return -EIO;
> -
> - /*
> - * We need to lock out other updates to the SPTE until the external
> - * page table has been modified. Use FROZEN_SPTE similar to
> - * the zapping case.
> - */
> - if (!try_cmpxchg64(rcu_dereference(sptep), old_spte, FROZEN_SPTE))
> - return -EBUSY;
> -
> - ret = kvm_x86_call(set_external_spte)(kvm, gfn, level, new_spte);
> - if (ret)
> - __kvm_tdp_mmu_write_spte(sptep, *old_spte);
> - else
> - __kvm_tdp_mmu_write_spte(sptep, new_spte);
> - return ret;
> -}
> -
> /**
> * handle_changed_spte - handle bookkeeping associated with an SPTE change
> * @kvm: kvm instance
> @@ -626,6 +599,8 @@ static inline int __must_check __tdp_mmu_set_spte_atomic(struct kvm *kvm,
> struct tdp_iter *iter,
> u64 new_spte)
> {
> + u64 *raw_sptep = rcu_dereference(iter->sptep);
> +
> /*
> * The caller is responsible for ensuring the old SPTE is not a FROZEN
> * SPTE. KVM should never attempt to zap or manipulate a FROZEN SPTE,
> @@ -638,31 +613,46 @@ static inline int __must_check __tdp_mmu_set_spte_atomic(struct kvm *kvm,
> int ret;
>
> /*
> - * Users of atomic zapping don't operate on mirror roots,
> - * so don't handle it and bug the VM if it's seen.
> + * KVM doesn't currently support zapping or splitting mirror
> + * SPTEs while holding mmu_lock for read.
> */
> - if (KVM_BUG_ON(!is_shadow_present_pte(new_spte), kvm))
> + if (KVM_BUG_ON(is_shadow_present_pte(iter->old_spte), kvm) ||
> + KVM_BUG_ON(!is_shadow_present_pte(new_spte), kvm))
> return -EBUSY;
Should this be -EIO instead, matching the -EIO returned by the old
set_external_spte_present() for this KVM_BUG_ON() case? Though -EBUSY was
introduced by commit 94faba8999b9 ("KVM: x86/tdp_mmu: Propagate tearing down
mirror page tables").
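I.e., something along these lines (untested, just to illustrate the question;
both checks flag KVM bugs rather than benign races):

	if (KVM_BUG_ON(is_shadow_present_pte(iter->old_spte), kvm) ||
	    KVM_BUG_ON(!is_shadow_present_pte(new_spte), kvm))
		return -EIO;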
> - ret = set_external_spte_present(kvm, iter->sptep, iter->gfn,
> - &iter->old_spte, new_spte, iter->level);
Add "lockdep_assert_held(&kvm->mmu_lock)" for this case?
> + /*
> + * Temporarily freeze the SPTE until the external PTE operation
> + * has completed, e.g. so that concurrent faults don't attempt
> + * to install a child PTE in the external page table before the
> + * parent PTE has been written.
> + */
> + if (!try_cmpxchg64(raw_sptep, &iter->old_spte, FROZEN_SPTE))
> + return -EBUSY;
> +
> + /*
> + * Update the external PTE. On success, set the mirror SPTE to
> + * the desired value. On failure, restore the old SPTE so that
> + * the SPTE isn't frozen in perpetuity.
> + */
> + ret = kvm_x86_call(set_external_spte)(kvm, iter->gfn,
> + iter->level, new_spte);
> if (ret)
> - return ret;
> - } else {
> - u64 *sptep = rcu_dereference(iter->sptep);
> -
> - /*
> - * Note, fast_pf_fix_direct_spte() can also modify TDP MMU SPTEs
> - * and does not hold the mmu_lock. On failure, i.e. if a
> - * different logical CPU modified the SPTE, try_cmpxchg64()
> - * updates iter->old_spte with the current value, so the caller
> - * operates on fresh data, e.g. if it retries
> - * tdp_mmu_set_spte_atomic()
> - */
> - if (!try_cmpxchg64(sptep, &iter->old_spte, new_spte))
> - return -EBUSY;
> + __kvm_tdp_mmu_write_spte(iter->sptep, iter->old_spte);
> + else
> + __kvm_tdp_mmu_write_spte(iter->sptep, new_spte);
> + return ret;
> }
>
> + /*
> + * Note, fast_pf_fix_direct_spte() can also modify TDP MMU SPTEs and
> + * does not hold the mmu_lock. On failure, i.e. if a different logical
> + * CPU modified the SPTE, try_cmpxchg64() updates iter->old_spte with
> + * the current value, so the caller operates on fresh data, e.g. if it
> + * retries tdp_mmu_set_spte_atomic()
> + */
> + if (!try_cmpxchg64(raw_sptep, &iter->old_spte, new_spte))
> + return -EBUSY;
> +
> return 0;
> }
>
> --
> 2.53.0.rc1.217.geba53bf80e-goog
>