Message-ID: <CALCETrWxBW-f_YcRyO8jH-LNnot-4GjEFAFoqzY87M04EZTBzA@mail.gmail.com>
Date: Wed, 10 Jun 2020 17:15:34 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Sean Christopherson <sean.j.christopherson@...el.com>
Cc: "David P. Reed" <dpreed@...pplum.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
X86 ML <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
Allison Randal <allison@...utok.net>,
Enrico Weigelt <info@...ux.net>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Kate Stewart <kstewart@...uxfoundation.org>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Randy Dunlap <rdunlap@...radead.org>,
Martin Molnar <martin.molnar.programming@...il.com>,
Andy Lutomirski <luto@...nel.org>,
Alexandre Chartre <alexandre.chartre@...cle.com>,
Jann Horn <jannh@...gle.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] Fix undefined operation VMXOFF during reboot and crash
On Wed, Jun 10, 2020 at 5:00 PM Sean Christopherson
<sean.j.christopherson@...el.com> wrote:
>
> On Wed, Jun 10, 2020 at 02:59:19PM -0700, Andy Lutomirski wrote:
> >
> >
> > > On Jun 10, 2020, at 11:21 AM, David P. Reed <dpreed@...pplum.com> wrote:
> > >
> > > If a panic/reboot occurs when CR4 has VMX enabled, a VMXOFF is
> > > done on all CPUs, to allow the INIT IPI to function, since
> > > INIT is suppressed when CPUs are in VMX root operation.
> > > However, VMXOFF causes an undefined operation fault if the CPU is not
> > > in VMX operation, that is, VMXON has not been executed, or VMXOFF
> > > has been executed, but VMX is enabled.
> >
> > I’m surprised. Wouldn’t this mean that emergency reboots always fail if a VM
> > is running? I would think someone would have noticed before.
>
> The call to cpu_vmxoff() is conditioned on CR4.VMXE==1, which KVM toggles in
> tandem with VMXON and VMXOFF. Out of tree hypervisors presumably do the
> same. That's obviously not atomic though, e.g. VMXOFF will #UD if the
> vmxoff_nmi() NMI arrives between CR4.VMXE=1 and VMXON, or between VMXOFF
> and CR4.VMXE=0.
It would be nice for the commit message to say "this happens when
vmxoff_nmi() races with KVM's VMXON/VMXOFF toggling". Or the commit
message should say something else if the bug happens for a different
reason.
The race with KVM should be quite unusual, since it involves rebooting
concurrently with loading or unloading KVM.
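To make the window Sean describes concrete, here is a rough userspace
model of it. This is only a sketch and not kernel code: struct cpu_model,
enum nmi_point, state_at() and emergency_vmxoff_would_ud() are names made
up for illustration. The one assumption taken from the thread is that the
emergency path attempts VMXOFF whenever CR4.VMXE==1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-CPU state that matters for the race. */
struct cpu_model {
	bool cr4_vmxe;	/* CR4.VMXE bit */
	bool vmx_on;	/* true between VMXON and VMXOFF */
};

/* Points in the enable/disable sequence where the NMI could land. */
enum nmi_point {
	BEFORE_ENABLE,		/* CR4.VMXE=0, not in VMX operation */
	AFTER_CR4_SET,		/* CR4.VMXE=1, VMXON not yet executed */
	AFTER_VMXON,		/* CR4.VMXE=1, in VMX operation */
	AFTER_VMXOFF,		/* CR4.VMXE=1, VMXOFF already executed */
	AFTER_CR4_CLEAR,	/* CR4.VMXE=0 again */
};

/* CPU state as seen by an NMI arriving at the given point. */
static struct cpu_model state_at(enum nmi_point p)
{
	struct cpu_model c = { false, false };

	switch (p) {
	case AFTER_CR4_SET:
	case AFTER_VMXOFF:
		c.cr4_vmxe = true;
		break;
	case AFTER_VMXON:
		c.cr4_vmxe = true;
		c.vmx_on = true;
		break;
	case BEFORE_ENABLE:
	case AFTER_CR4_CLEAR:
		break;
	}
	return c;
}

/*
 * Models the emergency path: VMXOFF is attempted only when CR4.VMXE
 * is set. Returns true if that VMXOFF would raise #UD.
 */
static bool emergency_vmxoff_would_ud(struct cpu_model c)
{
	if (!c.cr4_vmxe)
		return false;	/* VMXOFF is skipped entirely */
	return !c.vmx_on;	/* VMXOFF outside VMX operation is #UD */
}
```

In this model only AFTER_CR4_SET and AFTER_VMXOFF report a #UD, which
are the two windows named above; the other three points are safe.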