Message-Id: <20240725175232.337266-3-mlevitsk@redhat.com>
Date: Thu, 25 Jul 2024 13:52:32 -0400
From: Maxim Levitsky <mlevitsk@...hat.com>
To: kvm@...r.kernel.org
Cc: Sean Christopherson <seanjc@...gle.com>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	Borislav Petkov <bp@...en8.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	linux-kernel@...r.kernel.org,
	Paolo Bonzini <pbonzini@...hat.com>,
	x86@...nel.org,
	Maxim Levitsky <mlevitsk@...hat.com>
Subject: [PATCH v3 2/2] VMX: reset the segment cache after segment initialization in vmx_vcpu_reset

Reset the segment cache after segment initialization in vmx_vcpu_reset()
to avoid stale, uninitialized data being cached in the segment cache.

In particular, the following scenario is possible when full preemption
is enabled:

- The vCPU is just created, and the vCPU thread is preempted before
SS.AR_BYTES is written in vmx_vcpu_reset().

- During preemption, kvm_arch_vcpu_in_kernel() is called, which reads
the SS segment's AR bytes to determine whether the vCPU is in kernel mode.
This caches a value of 0 for SS.AR_BYTES.

- The vCPU thread is eventually scheduled back in and writes the correct
SS.AR_BYTES value to the vmcs, but the cached value remains stale and can
still be read, e.g. via KVM_GET_SREGS.

Usually this is not a problem because the VMX segment cache is reset on
each vCPU run. However, if userspace (e.g. the KVM selftests) reads the
segment registers just after the vCPU is created and modifies some of
them while passing others through unchanged (in this case SS.AR_BYTES),
the stale cached value makes it into the vmcs and later leads to a VM
entry failure due to an incorrect SS segment type.

Fix this by moving the vmx_segment_cache_clear() call to after the
segments are initialized.

Note that this still doesn't fix the problem of kvm_arch_vcpu_in_kernel()
reading stale data during the segment setup; that issue will be addressed
separately.

Signed-off-by: Maxim Levitsky <mlevitsk@...hat.com>
---
 arch/x86/kvm/vmx/vmx.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index fa9f307d9b18..d43bb755e15c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4870,9 +4870,6 @@ void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	vmx->hv_deadline_tsc = -1;
 	kvm_set_cr8(vcpu, 0);
 
-	vmx_segment_cache_clear(vmx);
-	kvm_register_mark_available(vcpu, VCPU_EXREG_SEGMENTS);
-
 	seg_setup(VCPU_SREG_CS);
 	vmcs_write16(GUEST_CS_SELECTOR, 0xf000);
 	vmcs_writel(GUEST_CS_BASE, 0xffff0000ul);
@@ -4899,6 +4896,9 @@ void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	vmcs_writel(GUEST_IDTR_BASE, 0);
 	vmcs_write32(GUEST_IDTR_LIMIT, 0xffff);
 
+	vmx_segment_cache_clear(vmx);
+	kvm_register_mark_available(vcpu, VCPU_EXREG_SEGMENTS);
+
 	vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
 	vmcs_write32(GUEST_INTERRUPTIBILITY_INFO, 0);
 	vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS, 0);
-- 
2.26.3

