Message-ID: <CALCETrWw3WKLx1k94NfH1jJm-XLid_G-zy8jz_Afdf3KkWjquw@mail.gmail.com>
Date: Thu, 11 Jun 2020 10:02:39 -0700
From: Andy Lutomirski <luto@...nel.org>
To: Sean Christopherson <sean.j.christopherson@...el.com>
Cc: "David P. Reed" <dpreed@...pplum.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
X86 ML <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
Allison Randal <allison@...utok.net>,
Enrico Weigelt <info@...ux.net>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Kate Stewart <kstewart@...uxfoundation.org>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Randy Dunlap <rdunlap@...radead.org>,
Martin Molnar <martin.molnar.programming@...il.com>,
Andy Lutomirski <luto@...nel.org>,
Alexandre Chartre <alexandre.chartre@...cle.com>,
Jann Horn <jannh@...gle.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] Fix undefined operation VMXOFF during reboot and crash
On Thu, Jun 11, 2020 at 10:00 AM Sean Christopherson
<sean.j.christopherson@...el.com> wrote:
>
> On Thu, Jun 11, 2020 at 12:33:20PM -0400, David P. Reed wrote:
> > To respond to Thomas Gleixner's suggestion about an exception-masking
> > mechanism: it may well be a better fix, but a) I used "BUG" as a model, and
> > b) the exception masking is undocumented anywhere I can find. These are
> > "static inline" routines, and only the "emergency" version needs protection,
> > because you'd want a random VMXOFF to actually trap.
>
> The only in-kernel usages of cpu_vmxoff() are for emergencies. And the only
> reasonable source of faults on VMXOFF is that VMX is already off; i.e. for
> the kernel's usage, the goal is purely to ensure VMX is disabled, and how we
> get there doesn't truly matter.
>
> > In at least one of the calls to emergency, it is stated that no locks may be
> > taken at all because of where it was.
> >
> > Further, I have a different patch that requires a scratch page per processor
> > to exist, but which never takes a #UD fault. (Basically, it attempts VMXON
> > first and then does VMXOFF after VMXON, which ensures exit from VMX root
> > mode; VMXON needs a blank page to either succeed or fail without a #GP
> > fault.) If someone prefers that, it's local to the routine, but it requires
> > that a new scratch page per processor be allocated. So after testing it, I
> > decided in the interest of memory reduction that masking the #UD was
> > preferable.
>
> Please no, doing VMXON, even temporarily, could cause breakage. The CPU's
> VMCS cache isn't cleared on VMXOFF. Doing VMXON after kdump_nmi_callback()
> invokes cpu_crash_vmclear_loaded_vmcss() would create a window where VMPTRLD
> could succeed in a hypervisor and lead to memory corruption in the new
> kernel when the VMCS is evicted from the non-coherent VMCS cache.
>
> > I'm happy to resubmit the masking exception patch as version 2, if it works
> > in my test case.
> >
> > Advice?
>
> Please test the below, which simply eats any exception on VMXOFF.
>
> diff --git a/arch/x86/include/asm/virtext.h b/arch/x86/include/asm/virtext.h
> index 9aad0e0876fb..54bc84d7028d 100644
> --- a/arch/x86/include/asm/virtext.h
> +++ b/arch/x86/include/asm/virtext.h
> @@ -32,13 +32,15 @@ static inline int cpu_has_vmx(void)
>
> /** Disable VMX on the current CPU
> *
> - * vmxoff causes a undefined-opcode exception if vmxon was not run
> - * on the CPU previously. Only call this function if you know VMX
> - * is enabled.
> + * VMXOFF causes a #UD if the CPU is not post-VMXON; eat any #UDs to handle
> + * races with a hypervisor doing VMXOFF, e.g. if an NMI arrived between
> + * VMXOFF and clearing CR4.VMXE.
> */
> static inline void cpu_vmxoff(void)
> {
> - asm volatile ("vmxoff");
> + asm volatile("1: vmxoff\n\t"
> + "2:\n\t"
> + _ASM_EXTABLE(1b, 2b));
> cr4_clear_bits(X86_CR4_VMXE);
> }
I think that just eating exceptions like this is asking for trouble.
How about having a separate cpu_emergency_vmxoff() that eats
exceptions and leaving cpu_vmxoff() alone? Or make cpu_vmxoff()
return an error on failure and have the normal caller WARN if there's
an error.
Silently eating exceptions in the non-emergency path makes it too easy
to regress something without noticing.
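For what it's worth, the second option could be sketched roughly as below. This is only an illustration of the "return an error and let the normal caller WARN" idea, not a tested patch; the asm-goto fixup style, the -EIO return value, and the caller name are my assumptions:

```c
/*
 * Sketch: cpu_vmxoff() reports failure instead of silently eating it.
 * If VMXOFF faults (CPU not post-VMXON), the exception table entry
 * redirects execution to the "fault" label.  CR4.VMXE is cleared on
 * both paths so the end state is the same either way.
 */
static inline int cpu_vmxoff(void)
{
	asm_volatile_goto("1: vmxoff\n\t"
			  _ASM_EXTABLE(1b, %l[fault])
			  ::: "cc", "memory" : fault);
	cr4_clear_bits(X86_CR4_VMXE);
	return 0;

fault:
	cr4_clear_bits(X86_CR4_VMXE);
	return -EIO;
}

/* Hypothetical non-emergency caller: a fault here is unexpected, so warn. */
static void example_disable_vmx(void)
{
	WARN_ON_ONCE(cpu_vmxoff());
}
```

The emergency path would then ignore the return value, while the regular path surfaces any unexpected fault via the WARN instead of hiding a regression.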