Message-ID: <aBAIL6oGYJ7IV85X@google.com>
Date: Mon, 28 Apr 2025 15:58:55 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Jim Mattson <jmattson@...gle.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org, 
	Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [PATCH v3 1/2] KVM: x86: Provide a capability to disable
 APERF/MPERF read intercepts

On Fri, Mar 21, 2025, Jim Mattson wrote:
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 2b52eb77e29c..6431cd33f06a 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -7684,6 +7684,7 @@ Valid bits in args[0] are::
>    #define KVM_X86_DISABLE_EXITS_HLT              (1 << 1)
>    #define KVM_X86_DISABLE_EXITS_PAUSE            (1 << 2)
>    #define KVM_X86_DISABLE_EXITS_CSTATE           (1 << 3)
> +  #define KVM_X86_DISABLE_EXITS_APERFMPERF       (1 << 4)

Might be pre-existing with C-states, but I think the documentation needs to call
out that userspace is responsible for enumerating APERFMPERF in guest CPUID.
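
E.g. the host enabling in userspace would look something like this (illustrative
sketch only, assuming the new #define from this patch; error handling omitted):

	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	/*
	 * Disable APERF/MPERF read intercepts for the VM.  Note, KVM does
	 * NOT set CPUID.0x6:ECX[0] on userspace's behalf; userspace must
	 * enumerate APERFMPERF in each vCPU's CPUID entries separately.
	 */
	static int disable_aperfmperf_exits(int vm_fd)
	{
		struct kvm_enable_cap cap = {
			.cap = KVM_CAP_X86_DISABLE_EXITS,
			.args[0] = KVM_X86_DISABLE_EXITS_APERFMPERF,
		};

		return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
	}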

And more importantly, KVM either needs to honor APERFMPERF in each vCPU's CPUID,
or the documentation needs to call out that KVM doesn't honor guest CPUID for 
APERF/MPERF MSRs.  I don't have a strong preference either way, but I'm leaning
toward having KVM honor CPUID so that if someone copy+pastes the KVM selftest   
code for the host enabling, it'll do the right thing.  On the other hand, KVM  
doesn't (and shouldn't) fully emulate the MSRs, so I'm a-ok if we ignore CPUID
entirely (but document it).
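
If KVM were to honor CPUID, I'd expect something like this in the vendor
interception code (rough sketch; the kvm_aperfmperf_in_guest() helper name is
made up to match the existing kvm_cstate_in_guest() pattern):

	/*
	 * Disable the read intercepts only if userspace both enabled the
	 * capability AND enumerated APERFMPERF to this vCPU.
	 */
	static bool guest_can_use_aperfmperf(struct kvm_vcpu *vcpu)
	{
		return kvm_aperfmperf_in_guest(vcpu->kvm) &&
		       guest_cpuid_has(vcpu, X86_FEATURE_APERFMPERF);
	}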

Ignoring CPUID entirely would also make it easier to document that KVM doesn't
support loading/saving C-state or APERF/MPERF MSRs via load/store lists on VM-Enter
and VM-Exit.  E.g. we can simply say KVM doesn't emulate the MSRs in any capacity,
and that the capability disables the exits/interception, no more, no less.

Heh, I guess maybe I've talked myself into having KVM ignore guest CPUID :-) 

> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index ea44c1da5a7c..5b38d5c00788 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -44,7 +44,7 @@ static inline struct page *__sme_pa_to_page(unsigned long pa)
>  #define	IOPM_SIZE PAGE_SIZE * 3
>  #define	MSRPM_SIZE PAGE_SIZE * 2
>  
> -#define MAX_DIRECT_ACCESS_MSRS	48
> +#define MAX_DIRECT_ACCESS_MSRS	50

Ugh, I really need to get the MSR interception cleanup series posted.

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4b64ab350bcd..1b3cdca806b4 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4535,6 +4535,9 @@ static u64 kvm_get_allowed_disable_exits(void)
>  {
>  	u64 r = KVM_X86_DISABLE_EXITS_PAUSE;
>  
> +	if (boot_cpu_has(X86_FEATURE_APERFMPERF))
> +		r |= KVM_X86_DISABLE_EXITS_APERFMPERF;
> +
>  	if (!mitigate_smt_rsb) {
>  		r |= KVM_X86_DISABLE_EXITS_HLT |
>  			KVM_X86_DISABLE_EXITS_CSTATE;
> @@ -6543,7 +6546,8 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  
>  		if (!mitigate_smt_rsb && boot_cpu_has_bug(X86_BUG_SMT_RSB) &&
>  		    cpu_smt_possible() &&
> -		    (cap->args[0] & ~KVM_X86_DISABLE_EXITS_PAUSE))
> +		    (cap->args[0] & ~(KVM_X86_DISABLE_EXITS_PAUSE |
> +				      KVM_X86_DISABLE_EXITS_APERFMPERF)))
>  			pr_warn_once(SMT_RSB_MSG);
>  
>  		if (cap->args[0] & KVM_X86_DISABLE_EXITS_PAUSE)
> @@ -6554,6 +6558,8 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  			kvm->arch.hlt_in_guest = true;
>  		if (cap->args[0] & KVM_X86_DISABLE_EXITS_CSTATE)
>  			kvm->arch.cstate_in_guest = true;
> +		if (cap->args[0] & KVM_X86_DISABLE_EXITS_APERFMPERF)
> +			kvm->arch.aperfmperf_in_guest = true;

Rather than an ever-growing stream of booleans, what about tracking the flags
as a u64 and providing a builder macro to generate the helpers?  The latter is a
bit gratuitous, but this seems like the type of boilerplate that would be
embarrassingly easy to screw up without anyone noticing.
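
E.g. with the patch below, wiring up APERFMPERF should reduce to the new
#define plus a one-line helper instantiation (sketch, on top of this patch):

	BUILD_DISABLED_EXITS_HELPER(aperfmperf, APERFMPERF);

since kvm_vm_ioctl_enable_cap() simply ORs cap->args[0] into
kvm->arch.disabled_exits.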

Very lightly tested...

--
From: Sean Christopherson <seanjc@...gle.com>
Date: Mon, 28 Apr 2025 11:35:47 -0700
Subject: [PATCH] KVM: x86: Consolidate DISABLE_EXITS_xxx handling into a
 single kvm_arch field

Replace the individual xxx_in_guest booleans with a single field to track
exits that have been disabled for a VM.  To further cut down on the amount
of boilerplate needed for each disabled exit, add a builder macro to
generate the accessor.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@...gle.com>
---
 arch/x86/include/asm/kvm_host.h |  5 +----
 arch/x86/kvm/svm/svm.c          |  2 +-
 arch/x86/kvm/vmx/vmx.c          |  2 +-
 arch/x86/kvm/x86.c              | 25 ++++++++-----------------
 arch/x86/kvm/x86.h              | 28 +++++++++-------------------
 5 files changed, 20 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6c06f3d6e081..4b174499b29c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1389,10 +1389,7 @@ struct kvm_arch {
 
 	gpa_t wall_clock;
 
-	bool mwait_in_guest;
-	bool hlt_in_guest;
-	bool pause_in_guest;
-	bool cstate_in_guest;
+	u64 disabled_exits;
 
 	unsigned long irq_sources_bitmap;
 	s64 kvmclock_offset;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index cc1c721ba067..0f0c06be85d6 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5053,7 +5053,7 @@ static int svm_vm_init(struct kvm *kvm)
 	}
 
 	if (!pause_filter_count || !pause_filter_thresh)
-		kvm->arch.pause_in_guest = true;
+		kvm->arch.disabled_exits |= KVM_X86_DISABLE_EXITS_PAUSE;
 
 	if (enable_apicv) {
 		int ret = avic_vm_init(kvm);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ef2d7208dd20..109ade8fc47b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7613,7 +7613,7 @@ int vmx_vcpu_create(struct kvm_vcpu *vcpu)
 int vmx_vm_init(struct kvm *kvm)
 {
 	if (!ple_gap)
-		kvm->arch.pause_in_guest = true;
+		kvm->arch.disabled_exits |= KVM_X86_DISABLE_EXITS_PAUSE;
 
 	if (boot_cpu_has(X86_BUG_L1TF) && enable_ept) {
 		switch (l1tf_mitigation) {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f6ce044b090a..3800d6cfecce 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6591,27 +6591,18 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 			break;
 
 		mutex_lock(&kvm->lock);
-		if (kvm->created_vcpus)
-			goto disable_exits_unlock;
-
+		if (!kvm->created_vcpus) {
 #define SMT_RSB_MSG "This processor is affected by the Cross-Thread Return Predictions vulnerability. " \
 		    "KVM_CAP_X86_DISABLE_EXITS should only be used with SMT disabled or trusted guests."
 
-		if (!mitigate_smt_rsb && boot_cpu_has_bug(X86_BUG_SMT_RSB) &&
-		    cpu_smt_possible() &&
-		    (cap->args[0] & ~KVM_X86_DISABLE_EXITS_PAUSE))
-			pr_warn_once(SMT_RSB_MSG);
+			if (!mitigate_smt_rsb && cpu_smt_possible() &&
+			    boot_cpu_has_bug(X86_BUG_SMT_RSB) &&
+			    (cap->args[0] & ~KVM_X86_DISABLE_EXITS_PAUSE))
+				pr_warn_once(SMT_RSB_MSG);
 
-		if (cap->args[0] & KVM_X86_DISABLE_EXITS_PAUSE)
-			kvm->arch.pause_in_guest = true;
-		if (cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT)
-			kvm->arch.mwait_in_guest = true;
-		if (cap->args[0] & KVM_X86_DISABLE_EXITS_HLT)
-			kvm->arch.hlt_in_guest = true;
-		if (cap->args[0] & KVM_X86_DISABLE_EXITS_CSTATE)
-			kvm->arch.cstate_in_guest = true;
-		r = 0;
-disable_exits_unlock:
+			kvm->arch.disabled_exits |= cap->args[0];
+			r = 0;
+		}
 		mutex_unlock(&kvm->lock);
 		break;
 	case KVM_CAP_MSR_PLATFORM_INFO:
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 88a9475899c8..1675017eea88 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -481,25 +481,15 @@ static inline u64 nsec_to_cycles(struct kvm_vcpu *vcpu, u64 nsec)
 	    __rem;						\
 	 })
 
-static inline bool kvm_mwait_in_guest(struct kvm *kvm)
-{
-	return kvm->arch.mwait_in_guest;
-}
-
-static inline bool kvm_hlt_in_guest(struct kvm *kvm)
-{
-	return kvm->arch.hlt_in_guest;
-}
-
-static inline bool kvm_pause_in_guest(struct kvm *kvm)
-{
-	return kvm->arch.pause_in_guest;
-}
-
-static inline bool kvm_cstate_in_guest(struct kvm *kvm)
-{
-	return kvm->arch.cstate_in_guest;
-}
+#define BUILD_DISABLED_EXITS_HELPER(lname, uname)				\
+static inline bool kvm_##lname##_in_guest(struct kvm *kvm)			\
+{										\
+	return kvm->arch.disabled_exits & KVM_X86_DISABLE_EXITS_##uname;	\
+}
+BUILD_DISABLED_EXITS_HELPER(hlt, HLT);
+BUILD_DISABLED_EXITS_HELPER(pause, PAUSE);
+BUILD_DISABLED_EXITS_HELPER(mwait, MWAIT);
+BUILD_DISABLED_EXITS_HELPER(cstate, CSTATE);
 
 static inline bool kvm_notify_vmexit_enabled(struct kvm *kvm)
 {

base-commit: 45eb29140e68ffe8e93a5471006858a018480a45
--
