Message-ID: <CALzav=cy8SoVs1N7sCbtqv5b5smGFQi0JNwOdwrQkkv0wMrz8g@mail.gmail.com>
Date: Tue, 12 Aug 2025 12:21:15 -0700
From: David Matlack <dmatlack@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: James Houghton <jthoughton@...gle.com>, Paolo Bonzini <pbonzini@...hat.com>,
Vipin Sharma <vipinsh@...gle.com>, kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 3/7] KVM: x86/mmu: Recover TDP MMU NX huge pages using
MMU read lock
On Fri, Aug 1, 2025 at 3:00 PM Sean Christopherson <seanjc@...gle.com> wrote:
>
> On Fri, Aug 01, 2025, David Matlack wrote:
> > On Mon, Jul 28, 2025 at 2:49 PM James Houghton <jthoughton@...gle.com> wrote:
> > > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > > index a6a1fb42b2d1..e2bde6a5e346 100644
> > > > --- a/arch/x86/kvm/mmu/mmu.c
> > > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > > @@ -7624,8 +7624,14 @@ static bool kvm_mmu_sp_dirty_logging_enabled(struct kvm *kvm,
> > > > static void kvm_recover_nx_huge_pages(struct kvm *kvm,
> > > > const enum kvm_mmu_type mmu_type)
> > > > {
> > > > +#ifdef CONFIG_X86_64
> > > > + const bool is_tdp_mmu = mmu_type == KVM_TDP_MMU;
> > > > + spinlock_t *tdp_mmu_pages_lock = &kvm->arch.tdp_mmu_pages_lock;
> > > > +#else
> > > > + const bool is_tdp_mmu = false;
> > > > + spinlock_t *tdp_mmu_pages_lock = NULL;
> > > > +#endif
> > > > unsigned long to_zap = nx_huge_pages_to_zap(kvm, mmu_type);
> > > > - bool is_tdp_mmu = mmu_type == KVM_TDP_MMU;
> > > > struct list_head *nx_huge_pages;
> > > > struct kvm_mmu_page *sp;
> > > > LIST_HEAD(invalid_list);
> > > > @@ -7648,15 +7654,12 @@ static void kvm_recover_nx_huge_pages(struct kvm *kvm,
> > > > rcu_read_lock();
> > > >
> > > > for ( ; to_zap; --to_zap) {
> > > > -#ifdef CONFIG_X86_64
> > > > if (is_tdp_mmu)
> > > > - spin_lock(&kvm->arch.tdp_mmu_pages_lock);
> > > > -#endif
> > > > + spin_lock(tdp_mmu_pages_lock);
> > > > +
> > > > if (list_empty(nx_huge_pages)) {
> > > > -#ifdef CONFIG_X86_64
> > > > if (is_tdp_mmu)
> > > > - spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
> > > > -#endif
> > > > + spin_unlock(tdp_mmu_pages_lock);
> > > > break;
> > > > }
> > > >
> > > > @@ -7675,10 +7678,8 @@ static void kvm_recover_nx_huge_pages(struct kvm *kvm,
> > > >
> > > > unaccount_nx_huge_page(kvm, sp);
> > > >
> > > > -#ifdef CONFIG_X86_64
> > > > if (is_tdp_mmu)
> > > > - spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
> > > > -#endif
> > > > + spin_unlock(tdp_mmu_pages_lock);
> > > >
> > > > /*
> > > > * Do not attempt to recover any NX Huge Pages that are being
> > > > --
> > >
> > > LGTM! Thanks Sean.
> >
> > Is the compiler not smart enough to optimize out kvm->arch.tdp_mmu_pages_lock?
>
> Yes, the compiler will eliminate dead code at most optimization levels. But that
> optimization phase happens after initial compilation, i.e. the compiler needs to
> generate the (probably intermediate?) code before it can trim away paths that are
> unreachable.
>
> > (To avoid needing the extra local variable?) I thought there was some other
> > KVM code that relied on similar optimizations but I would have to go dig them
> > up to remember.
>
> KVM, and the kernel, absolutely relies on dead code elimination. KVM most blatantly
> uses the technique to avoid _defining_ stubs for code that is guarded by a Kconfig,
> e.g. all of these functions are defined in sev.c (guarded by CONFIG_KVM_AMD_SEV),
> but callers are guarded only with sev_guest() or sev_es_guest(), not with explicit
> #ifdefs.
>
> There are no build errors because the function calls aren't fully resolved until
> link time (when svm.o and sev.o are linked together into kvm-amd.o). But KVM
> still needs to _declare_
> the functions, otherwise the compiler would fail during its initial code generation.
>
> int pre_sev_run(struct vcpu_svm *svm, int cpu);
> void sev_init_vmcb(struct vcpu_svm *svm);
> void sev_vcpu_after_set_cpuid(struct vcpu_svm *svm);
> int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in);
> void sev_es_vcpu_reset(struct vcpu_svm *svm);
> void sev_es_recalc_msr_intercepts(struct kvm_vcpu *vcpu);
> void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
> void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa);
> void sev_es_unmap_ghcb(struct vcpu_svm *svm);
>
> Other notable "users" of dead code elimination are the BUILD_BUG_ON() family of
> compile-time asserts. So long as the condition can be resolved to a constant
> false value during compile time, the "call" to __compiletime_error() will be
> elided and everyone is happy.
>
> #ifdef __OPTIMIZE__
> /*
> * #ifdef __OPTIMIZE__ is only a good approximation; for instance "make
> * CFLAGS_foo.o=-Og" defines __OPTIMIZE__, does not elide the conditional code
> * and can break compilation with wrong error message(s). Combine with
> * -U__OPTIMIZE__ when needed.
> */
> # define __compiletime_assert(condition, msg, prefix, suffix) \
> do { \
> /* \
> * __noreturn is needed to give the compiler enough \
> * information to avoid certain possibly-uninitialized \
> * warnings (regardless of the build failing). \
> */ \
> __noreturn extern void prefix ## suffix(void) \
> __compiletime_error(msg); \
> if (!(condition)) \
> prefix ## suffix(); \
> } while (0)
> #else
> # define __compiletime_assert(condition, msg, prefix, suffix) ((void)(condition))
> #endif
>
> Note, static_assert() is different in that it's a true assertion that's resolved
> early on during compilation.
>
> * Contrary to BUILD_BUG_ON(), static_assert() can be used at global
> * scope, but requires the expression to be an integer constant
> * expression (i.e., it is not enough that __builtin_constant_p() is
> * true for expr).
>
>
> From a previous thread related to asserts (https://lore.kernel.org/all/aFGY0KVUksf1a6xB@google.com):
>
> : The advantage of BUILD_BUG_ON() is that it works so long as the condition is
> : compile-time constant, whereas static_assert() requires the condition to be an
> : integer constant expression. E.g. BUILD_BUG_ON() can be used so long as the
> : condition is eventually resolved to a constant, whereas static_assert() has
> : stricter requirements.
> :
> : E.g. the fls64() assert below is fully resolved at compile time, but isn't a
> : purely constant expression, i.e. that one *needs* to be BUILD_BUG_ON().
> :
> : --
> : arch/x86/kvm/svm/avic.c: In function ‘avic_init_backing_page’:
> : arch/x86/kvm/svm/avic.c:293:45: error: expression in static assertion is not constant
> : 293 | static_assert(__PHYSICAL_MASK_SHIFT <=
> : include/linux/build_bug.h:78:56: note: in definition of macro ‘__static_assert’
> : 78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
> : | ^~~~
> : arch/x86/kvm/svm/avic.c:293:9: note: in expansion of macro ‘static_assert’
> : 293 | static_assert(__PHYSICAL_MASK_SHIFT <=
> : | ^~~~~~~~~~~~~
> : make[5]: *** [scripts/Makefile.build:203: arch/x86/kvm/svm/avic.o] Error 1
> : --
> :
> : The downside of BUILD_BUG_ON() is that it can't be used at global scope, i.e.
> : needs to be called from a function.
> :
> : As a result, when adding an assertion in a function, using BUILD_BUG_ON() is
> : slightly preferred, because it's less likely to break in the future. E.g. if
> : X2AVIC_MAX_PHYSICAL_ID were changed to something that is a compile-time constant,
> : but for whatever reason isn't a pure integer constant.
Thank you so much for the detailed explanation!