linux-kernel - Re: [PATCH v5 3/7] KVM: x86/mmu: Recover TDP MMU NX huge pages using MMU read lock

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <aIft7sUk_w8rV2DB@google.com>
Date: Mon, 28 Jul 2025 14:38:54 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: David Matlack <dmatlack@...gle.com>
Cc: James Houghton <jthoughton@...gle.com>, Paolo Bonzini <pbonzini@...hat.com>, 
	Vipin Sharma <vipinsh@...gle.com>, kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 3/7] KVM: x86/mmu: Recover TDP MMU NX huge pages using
 MMU read lock

On Mon, Jul 28, 2025, David Matlack wrote:
> On Mon, Jul 28, 2025 at 11:08 AM James Houghton <jthoughton@...gle.com> wrote:
> > On Wed, Jul 23, 2025 at 1:35 PM Sean Christopherson <seanjc@...gle.com> wrote:
> > > > @@ -7559,8 +7590,17 @@ static void kvm_recover_nx_huge_pages(struct kvm *kvm,
> > > >       rcu_read_lock();
> > > >
> > > >       for ( ; to_zap; --to_zap) {
> > > > -             if (list_empty(nx_huge_pages))
> > > > +#ifdef CONFIG_X86_64
> > >
> > > These #ifdefs still make me sad, but I also still think they're the least awful
> > > solution.  And hopefully we will jettison 32-bit sooner than later :-)
> >
> > Yeah I couldn't come up with anything better. :(
> 
> Could we just move the definition of tdp_mmu_pages_lock outside of
> CONFIG_X86_64? The only downside I can think of is slightly larger kvm
> structs for 32-bit builds.

Hmm, I was going to say "no, because we'd also need to do spin_lock_init()", but
obviously spin_(un)lock() will only ever be invoked for 64-bit kernels.  I still
don't love the idea of making tdp_mmu_pages_lock visible outside of CONFIG_X86_64,
it feels like we're just asking to introduce (likely benign) bugs.

Ugh, and I just noticed this as well:

  #ifndef CONFIG_X86_64
  #define KVM_TDP_MMU -1
  #endif

Rather than expose kvm->arch.tdp_mmu_pages_lock, what about using a single #ifdef
section to bury both is_tdp_mmu and a local kvm->arch.tdp_mmu_pages_lock pointer?

Alternatively, we could do:

	const bool is_tdp_mmu = IS_ENABLED(CONFIG_X86_64) && mmu_type != KVM_SHADOW_MMU;

to avoid referencing KVM_TDP_MMU, but that's quite ugly.  Overall, I think the
below strikes the best balance between polluting the code with #ifdefs, and
generating robust code.

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 52bf6a886bfd..c038d7cd187d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1372,10 +1372,6 @@ enum kvm_mmu_type {
        KVM_NR_MMU_TYPES,
 };
 
-#ifndef CONFIG_X86_64
-#define KVM_TDP_MMU -1
-#endif
-
 struct kvm_arch {
        unsigned long n_used_mmu_pages;
        unsigned long n_requested_mmu_pages;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index a6a1fb42b2d1..e2bde6a5e346 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -7624,8 +7624,14 @@ static bool kvm_mmu_sp_dirty_logging_enabled(struct kvm *kvm,
 static void kvm_recover_nx_huge_pages(struct kvm *kvm,
                                      const enum kvm_mmu_type mmu_type)
 {
+#ifdef CONFIG_X86_64
+       const bool is_tdp_mmu = mmu_type == KVM_TDP_MMU;
+       spinlock_t *tdp_mmu_pages_lock = &kvm->arch.tdp_mmu_pages_lock;
+#else
+       const bool is_tdp_mmu = false;
+       spinlock_t *tdp_mmu_pages_lock = NULL;
+#endif
        unsigned long to_zap = nx_huge_pages_to_zap(kvm, mmu_type);
-       bool is_tdp_mmu = mmu_type == KVM_TDP_MMU;
        struct list_head *nx_huge_pages;
        struct kvm_mmu_page *sp;
        LIST_HEAD(invalid_list);
@@ -7648,15 +7654,12 @@ static void kvm_recover_nx_huge_pages(struct kvm *kvm,
        rcu_read_lock();
 
        for ( ; to_zap; --to_zap) {
-#ifdef CONFIG_X86_64
                if (is_tdp_mmu)
-                       spin_lock(&kvm->arch.tdp_mmu_pages_lock);
-#endif
+                       spin_lock(tdp_mmu_pages_lock);
+
                if (list_empty(nx_huge_pages)) {
-#ifdef CONFIG_X86_64
                        if (is_tdp_mmu)
-                               spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
-#endif
+                               spin_unlock(tdp_mmu_pages_lock);
                        break;
                }
 
@@ -7675,10 +7678,8 @@ static void kvm_recover_nx_huge_pages(struct kvm *kvm,
 
                unaccount_nx_huge_page(kvm, sp);
 
-#ifdef CONFIG_X86_64
                if (is_tdp_mmu)
-                       spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
-#endif
+                       spin_unlock(tdp_mmu_pages_lock);
 
                /*
                 * Do not attempt to recover any NX Huge Pages that are being
--