[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZrF55uIvX2rcHtSW@chao-email>
Date: Tue, 6 Aug 2024 09:18:30 +0800
From: Chao Gao <chao.gao@...el.com>
To: Maxim Levitsky <mlevitsk@...hat.com>
CC: <kvm@...r.kernel.org>, Sean Christopherson <seanjc@...gle.com>, "Dave
Hansen" <dave.hansen@...ux.intel.com>, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, "H. Peter
Anvin" <hpa@...or.com>, <linux-kernel@...r.kernel.org>, Paolo Bonzini
<pbonzini@...hat.com>, <x86@...nel.org>
Subject: Re: [PATCH v3 2/2] VMX: reset the segment cache after segment
initialization in vmx_vcpu_reset
On Thu, Jul 25, 2024 at 01:52:32PM -0400, Maxim Levitsky wrote:
KVM: VMX:
>reset the segment cache after segment initialization in vmx_vcpu_reset
>to avoid stale uninitialized data being cached in the segment cache.
>
>In particular the following scenario is possible when full preemption
>is enabled:
>
>- vCPU is just created, and the vCPU thread is preempted before SS.AR_BYTES
>is written in vmx_vcpu_reset.
>
>- During preemption, the kvm_arch_vcpu_in_kernel is called which
>reads SS's segment AR byte to determine if the CPU was in the kernel.
>
>That caches 0 value of SS.AR_BYTES, then eventually the vCPU thread will be
>preempted back, then set the correct SS.AR_BYTES value in the vmcs
>and the cached value will remain stale, and could be read e.g via
>KVM_GET_SREGS.
>
>Usually this is not a problem because VMX segment cache is reset on each
>vCPU run, but if the userspace (e.g KVM selftests do) reads the segment
>registers just after the vCPU was created, and modifies some of them
>but passes through other registers and in this case SS.AR_BYTES,
>the stale value of it will make it into the vmcs,
>and later lead to a VM entry failure due to incorrect SS segment type.
I looked into the same issue last week, which was reported by someone
internally.
>
>Fix this by moving the vmx_segment_cache_clear() call to be after the
>segments are initialized.
>
>Note that this still doesn't fix the issue of kvm_arch_vcpu_in_kernel
>getting stale data during the segment setup, and that issue will
>be addressed later.
>
>Signed-off-by: Maxim Levitsky <mlevitsk@...hat.com>
Do you need a Fixes tag and/or Cc: stable?
>---
> arch/x86/kvm/vmx/vmx.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
>diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>index fa9f307d9b18..d43bb755e15c 100644
>--- a/arch/x86/kvm/vmx/vmx.c
>+++ b/arch/x86/kvm/vmx/vmx.c
>@@ -4870,9 +4870,6 @@ void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
> vmx->hv_deadline_tsc = -1;
> kvm_set_cr8(vcpu, 0);
>
>- vmx_segment_cache_clear(vmx);
>- kvm_register_mark_available(vcpu, VCPU_EXREG_SEGMENTS);
>-
> seg_setup(VCPU_SREG_CS);
> vmcs_write16(GUEST_CS_SELECTOR, 0xf000);
> vmcs_writel(GUEST_CS_BASE, 0xffff0000ul);
>@@ -4899,6 +4896,9 @@ void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
> vmcs_writel(GUEST_IDTR_BASE, 0);
> vmcs_write32(GUEST_IDTR_LIMIT, 0xffff);
>
>+ vmx_segment_cache_clear(vmx);
>+ kvm_register_mark_available(vcpu, VCPU_EXREG_SEGMENTS);
vmx_segment_cache_clear() is called in a few other sites. I think at least the
call in __vmx_set_segment() should be fixed, because QEMU may read SS.AR right
after a write to it. if the write was preempted after the cache was cleared but
before the new value being written into VMCS, QEMU would find that SS.AR held a
stale value.
>+
> vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
> vmcs_write32(GUEST_INTERRUPTIBILITY_INFO, 0);
> vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS, 0);
>--
>2.26.3
>
>
Powered by blists - more mailing lists