lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <afcd052fcf583526f72b80a779dc50f49cd0ae95.camel@redhat.com>
Date:   Wed, 28 Oct 2020 10:51:16 +0200
From:   Maxim Levitsky <mlevitsk@...hat.com>
To:     Qian Cai <cai@...hat.com>, kvm@...r.kernel.org
Cc:     "H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
        Borislav Petkov <bp@...en8.de>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Sean Christopherson <sean.j.christopherson@...el.com>,
        Joerg Roedel <joro@...tes.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org, Jim Mattson <jmattson@...gle.com>,
        "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH v6 2/4] KVM: x86: report negative values from wrmsr
 emulation to userspace

On Tue, 2020-10-27 at 16:31 -0400, Qian Cai wrote:
> On Mon, 2020-10-26 at 15:40 -0400, Qian Cai wrote:
> > On Wed, 2020-09-23 at 00:10 +0300, Maxim Levitsky wrote:
> > > This will allow the KVM to report such errors (e.g -ENOMEM)
> > > to the userspace.
> > > 
> > > Signed-off-by: Maxim Levitsky <mlevitsk@...hat.com>
> > 
> > Reverting this and its dependency:
> > 
> > 72f211ecaa80 KVM: x86: allow kvm_x86_ops.set_efer to return an error value
> > 
> > on the top of linux-next (they have also unfortunately merged into the
> > mainline
> > at the same time) fixed an issue that a simple Intel KVM guest is unable to
> > boot
> > below.
> 
> So I debug this a bit more. This also breaks nested virt (VMX). We have here:
> 
> [  345.504403] kvm [1491]: vcpu0 unhandled rdmsr: 0x4e data 0x0
> [  345.758560] kvm [1491]: vcpu0 unhandled rdmsr: 0x1c9 data 0x0
> [  345.758594] kvm [1491]: vcpu0 unhandled rdmsr: 0x1a6 data 0x0
> [  345.758619] kvm [1491]: vcpu0 unhandled rdmsr: 0x1a7 data 0x0
> [  345.758644] kvm [1491]: vcpu0 unhandled rdmsr: 0x3f6 data 0x0
> [  345.951601] kvm [1493]: vcpu1 unhandled rdmsr: 0x4e data 0x0
> [  351.857036] kvm [1493]: vcpu1 unhandled wrmsr: 0xc90 data 0xfffff
> 
> After this commit, -ENOENT is returned to vcpu_enter_guest() causes the
> userspace to abort.

Thank you very much for debugging it!

Yestarday I tried pretty much everything to reproduce it on my intel's laptop 
but I wasn't able.

I tried kvm/queue, then latest mainline, linux-next on my laptop
and all worked and even booted nested guests.

For qemu side,
I even tried RHEL's qemu, exact version as you tested.

I got really unlucky here that it seems that none of my guests ever write
an unknown msr.
Now with the information you provided, it is trivial to reproduce it 
even on my AMD machine -
All I need to do is to write an unknown msr, 
something like 'wrmsr 0x1234 0x0' using wrmsr tool.

And for the root cause of this, this is the fallout of last minute rebase of my code on top
of the userspace msr feature. I missed this -ENOENT logic that clashes with mine.

> 
> kvm_msr_ignored_check()
>   kvm_set_msr()
>     kvm_emulate_wrmsr()
>       vmx_handle_exit()
>         vcpu_enter_guest()
> 
> Something like below will unbreak the userspace, but does anyone has a better
> idea?

Checking for -ENOENT might be a right solution but I'll check now in depth,
what else interactions are affected and if this can be done closer to the
place where it happens.

Also, I am more inclined to add a new positive error code (similiar to KVM_MSR_RET_INVALID)
to indicate 'msr not found' error since this error condition is arch defined.

My reasoning is that positive error values should be used for error conditions
that cause #GP to the guest, while negative error values should only be used
for internal errors which are propogated to qemu userspace and usually kill
the guest.

I'll prepare a patch for this very soon.

Best regards,
	Maxim Levitsky 


> 
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1748,7 +1748,7 @@ int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu)
>                 return 0;
>  
>         /* Signal all other negative errors to userspace */
> -       if (r < 0)
> +       if (r < 0 && r != -ENOENT)
>                 return r;
>  
>         /* MSR write failed? Inject a #GP */
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ