linux-kernel - Re: [PATCH v2 23/49] KVM: x86: Handle kernel- and KVM-defined CPUID words in a single helper

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZqKlDC11gItH1uj9@google.com>
Date: Thu, 25 Jul 2024 12:18:36 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Maxim Levitsky <mlevitsk@...hat.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Vitaly Kuznetsov <vkuznets@...hat.com>, kvm@...r.kernel.org, 
	linux-kernel@...r.kernel.org, Hou Wenlong <houwenlong.hwl@...group.com>, 
	Kechen Lu <kechenl@...dia.com>, Oliver Upton <oliver.upton@...ux.dev>, 
	Binbin Wu <binbin.wu@...ux.intel.com>, Yang Weijiang <weijiang.yang@...el.com>, 
	Robert Hoo <robert.hoo.linux@...il.com>
Subject: Re: [PATCH v2 23/49] KVM: x86: Handle kernel- and KVM-defined CPUID
 words in a single helper

On Wed, Jul 24, 2024, Maxim Levitsky wrote:
> On Mon, 2024-07-08 at 14:18 -0700, Sean Christopherson wrote:
> >  And in the unlikely case that we royally screw up and fail
> > to call kvm_cpu_cap_init() on a word, starting with 0xff would result in all
> > features in the uninitialized word being treated as supported.
> Yes, but IMHO the chances of this happening are very low.
> 
> I understand your concerns though, but then IMHO it's better to keep the
> kvm_cpu_cap_init_kvm_defined, because this way at least the function name
> cleanly describes the difference instead of the difference being buried in the function
> itself (the comment helps but still it is less noticeable than a function name). 
> 
> I don't have a very strong opinion on this though, 
> because IMHO the kvm_cpu_cap_init_kvm_defined is also not very user friendly, 
> so if you really think that the new code is more readable, let it be.

Hmm, the main motiviation of this patch was to avoid duplicate code in later
patches, but looking at the end result, I don't think that eliminating the
KVM-defined variants is necessary, e.g. ending up with this should work, too.

#define __kvm_cpu_cap_init(leaf)					\
do {									\
	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);	\
	const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;	\
	u32 kvm_cpu_cap_emulated = 0;					\
	u32 kvm_cpu_cap_synthesized = 0;				\
									\
	kvm_cpu_caps[leaf] &= (raw_cpuid_get(cpuid) |			\
			       kvm_cpu_cap_synthesized);		\
	kvm_cpu_caps[leaf] |= kvm_cpu_cap_emulated;			\
} while (0)

/* For kernel-defined leafs, mask the boot CPU's pre-populated value. */
#define kvm_cpu_cap_init(leaf, mask)					\
do {
	BUILD_BUG_ON(leaf >= NCAPINTS);					\
	kvm_cpu_caps[leaf] &= (mask);					\
									\
	__kvm_cpu_cap_init(leaf);					\
} while (0)

/* For KVM-defined leafs, explicitly set the leaf, KVM is the sole authority. */
#define kvm_cpu_cap_init_kvm_defined(leaf, mask)			\
do {									\
	BUILD_BUG_ON(leaf < NCAPINTS);					\
	kvm_cpu_caps[leaf] = (mask);					\
									\
	__kvm_cpu_cap_init(leaf);					\
} while (0)

That said, unless someone really likes kvm_cpu_cap_init_kvm_defined(), I am
leaning toward keeping this patch (but rewriting the changelog).  IMO, whether a
leaf is KVM-only or known to the kernel is a plumbing detail that really shouldn't
affect anything in kvm_set_cpu_caps().  Literally the only difference is whether
or not there are kernel capabilities to account for.  The "types" of features isn't
restricted in any way, e.g. CPUID_12_EAX is KVM-only and contains only scattered
features, but CPUID_7_1_EDX is KVM-only and contains only "regular" features.

And if a feature changes from KVM-only to kernel-managed, we'd need to update the
caller.  This is unlikely, but it seems like an unnecessary maintenance burden.

Ooh, and thinking more on that and on the argument against initializing the KVM-
only leafs to all ones, I think we should remove this:

	memcpy(&kvm_cpu_caps, &boot_cpu_data.x86_capability,
	       sizeof(kvm_cpu_caps) - (NKVMCAPINTS * sizeof(*kvm_cpu_caps)));

and instead explicitly mask the boot_cpu_data.x86_capability[leaf].  It's _way_
more likely that the kernel adds a leaf without updating KVM, in which case
copying the kernel capabilities without masking them against KVM's capabilities
would over-report the set of supported features.  The odds of over-reproring are
still low, as KVM limit the max leaf in __do_cpuid_func(), but unless I'm missing
something, the memcpy() trick adds no value in the current code base.

E.g. 

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index dbc3f6ce9203..593de2c1811b 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -730,18 +730,20 @@ do {                                                                      \
 } while (0)
 
 /*
- * For kernel-defined leafs, mask the boot CPU's pre-populated value.  For KVM-
- * defined leafs, explicitly set the leaf, as KVM is the one and only authority.
+ * For leafs that are managed by the kernel, mask the boot CPU's capabilities,
+ * which are populated by the kernel.  For KVM-only leafs, as KVM is the one
+ * and only authority.
  */
 #define kvm_cpu_cap_init(leaf, mask)                                   \
 do {                                                                   \
        const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);    \
        const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;   \
+       const u32 kernel_cpu_caps = boot_cpu_data.x86_capability[leaf]; \
        u32 kvm_cpu_cap_emulated = 0;                                   \
        u32 kvm_cpu_cap_synthesized = 0;                                \
                                                                        \
        if (leaf < NCAPINTS)                                            \
-               kvm_cpu_caps[leaf] &= (mask);                           \
+               kvm_cpu_caps[leaf] = kernel_cpu_caps & (mask);          \
        else                                                            \
                kvm_cpu_caps[leaf] = (mask);                            \
                                                                        \
@@ -763,9 +765,6 @@ void kvm_set_cpu_caps(void)
        BUILD_BUG_ON(sizeof(kvm_cpu_caps) - (NKVMCAPINTS * sizeof(*kvm_cpu_caps)) >
                     sizeof(boot_cpu_data.x86_capability));
 
-       memcpy(&kvm_cpu_caps, &boot_cpu_data.x86_capability,
-              sizeof(kvm_cpu_caps) - (NKVMCAPINTS * sizeof(*kvm_cpu_caps)));
-
        kvm_cpu_cap_init(CPUID_1_ECX,
                /*
                 * NOTE: MONITOR (and MWAIT) are emulated as NOP, but *not*