lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 5 Apr 2016 13:28:53 +0200
From:	Paolo Bonzini <pbonzini@...hat.com>
To:	David Matlack <dmatlack@...gle.com>, kvm@...r.kernel.org
Cc:	linux-kernel@...r.kernel.org, luto@...capital.net,
	stable@...r.kernel.org
Subject: Re: [PATCH] kvm: x86: do not leak guest xcr0 into host interrupt
 handlers



On 30/03/2016 21:24, David Matlack wrote:
> An interrupt handler that uses the fpu can kill a KVM VM, if it runs
> under the following conditions:
>  - the guest's xcr0 register is loaded on the cpu
>  - the guest's fpu context is not loaded
>  - the host is using eagerfpu
> 
> Note that the guest's xcr0 register and fpu context are not loaded as
> part of the atomic world switch into "guest mode". They are loaded by
> KVM while the cpu is still in "host mode".
> 
> Usage of the fpu in interrupt context is gated by irq_fpu_usable(). The
> interrupt handler will look something like this:
> 
> if (irq_fpu_usable()) {
>         kernel_fpu_begin();
> 
>         [... code that uses the fpu ...]
> 
>         kernel_fpu_end();
> }
> 
> As long as the guest's fpu is not loaded and the host is using eager
> fpu, irq_fpu_usable() returns true (interrupted_kernel_fpu_idle()
> returns true). The interrupt handler proceeds to use the fpu with
> the guest's xcr0 live.
> 
> kernel_fpu_begin() saves the current fpu context. If this uses
> XSAVE[OPT], it may leave the xsave area in an undesirable state.
> According to the SDM, during XSAVE bit i of XSTATE_BV is not modified
> if bit i is 0 in xcr0. So it's possible that XSTATE_BV[i] == 1 and
> xcr0[i] == 0 following an XSAVE.
> 
> kernel_fpu_end() restores the fpu context. Now if any bit i in
> XSTATE_BV == 1 while xcr0[i] == 0, XRSTOR generates a #GP. The
> fault is trapped and SIGSEGV is delivered to the current process.
> 
> Only pre-4.2 kernels appear to be vulnerable to this sequence of
> events. Commit 653f52c ("kvm,x86: load guest FPU context more eagerly")
> from 4.2 forces the guest's fpu to always be loaded on eagerfpu hosts.
> 
> This patch fixes the bug by keeping the host's xcr0 loaded outside
> of the interrupts-disabled region where KVM switches into guest mode.
> 
> Cc: stable@...r.kernel.org
> Suggested-by: Andy Lutomirski <luto@...capital.net>
> Signed-off-by: David Matlack <dmatlack@...gle.com>
> ---
>  arch/x86/kvm/x86.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)

While running my acceptance tests, in one case I got one CPU whose xcr0 
had leaked into the host.  This showed up as a SIGILL in strncasecmp's 
AVX code, and a simple program confirmed it:

    $ cat xgetbv.c
    #include <stdio.h>
    int main(void)
    {
        unsigned xcr0_h, xcr0_l;
        asm("xgetbv" : "=d"(xcr0_h), "=a"(xcr0_l) : "c"(0));
        printf("%08x:%08x\n", xcr0_h, xcr0_l);
    }
    $ gcc xgetbv.c -O2
    $ for i in `seq 0 55`; do echo $i `taskset -c $i ./a.out`; done|grep -v 007
    19 00000000:00000003

I'm going to rerun the tests without this patch, as it seems the most
likely culprit, and leave it out of the pull request if they pass.

Paolo

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e260ccb..8df1167 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -700,7 +700,6 @@ static int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
>  		if ((xcr0 & XFEATURE_MASK_AVX512) != XFEATURE_MASK_AVX512)
>  			return 1;
>  	}
> -	kvm_put_guest_xcr0(vcpu);
>  	vcpu->arch.xcr0 = xcr0;
>  
>  	if ((xcr0 ^ old_xcr0) & XFEATURE_MASK_EXTEND)
> @@ -6590,8 +6589,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  	kvm_x86_ops->prepare_guest_switch(vcpu);
>  	if (vcpu->fpu_active)
>  		kvm_load_guest_fpu(vcpu);
> -	kvm_load_guest_xcr0(vcpu);
> -
>  	vcpu->mode = IN_GUEST_MODE;
>  
>  	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
> @@ -6607,6 +6604,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  
>  	local_irq_disable();
>  
> +	kvm_load_guest_xcr0(vcpu);
> +
>  	if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests
>  	    || need_resched() || signal_pending(current)) {
>  		vcpu->mode = OUTSIDE_GUEST_MODE;
> @@ -6667,6 +6666,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  	vcpu->mode = OUTSIDE_GUEST_MODE;
>  	smp_wmb();
>  
> +	kvm_put_guest_xcr0(vcpu);
> +
>  	/* Interrupt is enabled by handle_external_intr() */
>  	kvm_x86_ops->handle_external_intr(vcpu);
>  
> @@ -7314,7 +7315,6 @@ void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
>  	 * and assume host would use all available bits.
>  	 * Guest xcr0 would be loaded later.
>  	 */
> -	kvm_put_guest_xcr0(vcpu);
>  	vcpu->guest_fpu_loaded = 1;
>  	__kernel_fpu_begin();
>  	__copy_kernel_to_fpregs(&vcpu->arch.guest_fpu.state);
> @@ -7323,8 +7323,6 @@ void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
>  
>  void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
>  {
> -	kvm_put_guest_xcr0(vcpu);
> -
>  	if (!vcpu->guest_fpu_loaded) {
>  		vcpu->fpu_counter = 0;
>  		return;
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ