linux-kernel - Re: [PATCH] Fix undefined operation VMXOFF during reboot and crash


Message-ID: <CALCETrWxBW-f_YcRyO8jH-LNnot-4GjEFAFoqzY87M04EZTBzA@mail.gmail.com>
Date:   Wed, 10 Jun 2020 17:15:34 -0700
From:   Andy Lutomirski <luto@...capital.net>
To:     Sean Christopherson <sean.j.christopherson@...el.com>
Cc:     "David P. Reed" <dpreed@...pplum.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        X86 ML <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
        Allison Randal <allison@...utok.net>,
        Enrico Weigelt <info@...ux.net>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Kate Stewart <kstewart@...uxfoundation.org>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        Randy Dunlap <rdunlap@...radead.org>,
        Martin Molnar <martin.molnar.programming@...il.com>,
        Andy Lutomirski <luto@...nel.org>,
        Alexandre Chartre <alexandre.chartre@...cle.com>,
        Jann Horn <jannh@...gle.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] Fix undefined operation VMXOFF during reboot and crash

On Wed, Jun 10, 2020 at 5:00 PM Sean Christopherson
<sean.j.christopherson@...el.com> wrote:
>
> On Wed, Jun 10, 2020 at 02:59:19PM -0700, Andy Lutomirski wrote:
> >
> >
> > > On Jun 10, 2020, at 11:21 AM, David P. Reed <dpreed@...pplum.com> wrote:
> > >
> > > If a panic/reboot occurs when CR4 has VMX enabled, a VMXOFF is
> > > done on all CPUS, to allow the INIT IPI to function, since
> > > INIT is suppressed when CPUs are in VMX root operation.
> > > However, VMXOFF causes an undefined operation fault if the CPU is not
> > > in VMX operation, that is, VMXON has not been executed, or VMXOFF
> > > has been executed, but VMX is enabled.
> >
> > I’m surprised. Wouldn’t this mean that emergency reboots always fail if a VM
> > is running?  I would think someone would have noticed before.
>
> The call to cpu_vmxoff() is conditioned on CR4.VMXE==1, which KVM toggles in
> tandem with VMXON and VMXOFF.  Out of tree hypervisors presumably do the
> same.  That's obviously not atomic though, e.g. VMXOFF will #UD if the
> vmxoff_nmi() NMI arrives between CR4.VMXE=1 and VMXON, or between VMXOFF
> and CR4.VMXE=0.

It would be nice for the commit message to say "this happens when
vmxoff_nmi() races with KVM's VMXON/VMXOFF toggling".  Or the commit
message should say something else if the bug happens for a different
reason.

The race with KVM should be quite unusual, since it involves rebooting
concurrently with loading or unloading KVM.
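
To make the window concrete, here is a rough user-space model of the
states Sean describes. It is only a sketch, not kernel code: the struct,
the enum and every function name below are hypothetical, and the only
facts it encodes are the ones stated in this thread (the NMI path
attempts VMXOFF when CR4.VMXE==1, and VMXOFF takes #UD when the CPU is
not in VMX operation).

```c
#include <stdbool.h>

/* Hypothetical model of one CPU's VMX-related state (illustration only). */
struct cpu_model {
    bool cr4_vmxe;          /* CR4.VMXE bit */
    bool in_vmx_operation;  /* true between VMXON and VMXOFF */
};

/* Steps a CPU passes through while a hypervisor turns VMX on, then off. */
enum vmx_step {
    VMX_IDLE,         /* CR4.VMXE=0, VMXON not executed */
    VMX_CR4_SET,      /* CR4.VMXE=1, VMXON not yet executed */
    VMX_ON,           /* CR4.VMXE=1, in VMX operation */
    VMX_OFF_DONE,     /* VMXOFF executed, CR4.VMXE still 1 */
    VMX_CR4_CLEARED,  /* CR4.VMXE=0 again */
    VMX_NSTEPS
};

static struct cpu_model state_at(enum vmx_step s)
{
    struct cpu_model c = { false, false };

    switch (s) {
    case VMX_CR4_SET:
    case VMX_OFF_DONE:
        c.cr4_vmxe = true;
        break;
    case VMX_ON:
        c.cr4_vmxe = true;
        c.in_vmx_operation = true;
        break;
    default:
        break;
    }
    return c;
}

/* The check discussed above: VMXOFF is attempted iff CR4.VMXE==1. */
static bool nmi_would_execute_vmxoff(const struct cpu_model *c)
{
    return c->cr4_vmxe;
}

/* VMXOFF outside VMX operation raises #UD. */
static bool nmi_would_fault(const struct cpu_model *c)
{
    return nmi_would_execute_vmxoff(c) && !c->in_vmx_operation;
}

/* Count the steps at which an NMI arriving would hit the #UD. */
static int count_faulting_steps(void)
{
    int n = 0;

    for (int s = 0; s < VMX_NSTEPS; s++) {
        struct cpu_model c = state_at((enum vmx_step)s);

        if (nmi_would_fault(&c))
            n++;
    }
    return n;
}
```

In this model only VMX_CR4_SET and VMX_OFF_DONE fault, which are the two
windows named above: between CR4.VMXE=1 and VMXON, and between VMXOFF
and CR4.VMXE=0.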