Message-ID: <CALCETrWw3WKLx1k94NfH1jJm-XLid_G-zy8jz_Afdf3KkWjquw@mail.gmail.com>
Date: Thu, 11 Jun 2020 10:02:39 -0700
From: Andy Lutomirski <luto@...nel.org>
To: Sean Christopherson <sean.j.christopherson@...el.com>
Cc: "David P. Reed" <dpreed@...pplum.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
X86 ML <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
Allison Randal <allison@...utok.net>,
Enrico Weigelt <info@...ux.net>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Kate Stewart <kstewart@...uxfoundation.org>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Randy Dunlap <rdunlap@...radead.org>,
Martin Molnar <martin.molnar.programming@...il.com>,
Andy Lutomirski <luto@...nel.org>,
Alexandre Chartre <alexandre.chartre@...cle.com>,
Jann Horn <jannh@...gle.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] Fix undefined operation VMXOFF during reboot and crash
On Thu, Jun 11, 2020 at 10:00 AM Sean Christopherson
<sean.j.christopherson@...el.com> wrote:
>
> On Thu, Jun 11, 2020 at 12:33:20PM -0400, David P. Reed wrote:
> > To respond to Thomas Gleixner's suggestion about an exception-masking
> > mechanism: it may well be a better fix, but a) I used "BUG" as a model, and
> > b) the exception masking is undocumented anywhere I can find. These are
> > "static inline" routines, and only the "emergency" version needs protection,
> > because you'd want a random VMXOFF to actually trap.
>
> The only in-kernel usages of cpu_vmxoff() are for emergencies. And the only
> reasonable source of faults on VMXOFF is that VMX is already off; i.e. for
> the kernel's usage, the goal is purely to ensure VMX is disabled, and how we
> get there doesn't truly matter.
>
> > In at least one of the calls to emergency, it is stated that no locks may be
> > taken at all because of where it was.
> >
> > Further, I have a different patch that requires a scratch page per processor
> > to exist, but which never takes a #UD fault. (Basically, it attempts VMXON
> > first and then does VMXOFF after VMXON, which ensures exit from VMX root
> > mode; VMXON needs a blank page to either succeed or fail without a #GP
> > fault.) If someone prefers that, it's local to the routine, but it requires
> > that a new scratch page per processor be allocated. So after testing it, I
> > decided in the interest of memory reduction that masking the #UD was
> > preferable.
>
> Please no, doing VMXON, even temporarily, could cause breakage. The CPU's
> VMCS cache isn't cleared on VMXOFF. Doing VMXON after kdump_nmi_callback()
> invokes cpu_crash_vmclear_loaded_vmcss() would create a window where VMPTRLD
> could succeed in a hypervisor and lead to memory corruption in the new
> kernel when the VMCS is evicted from the non-coherent VMCS cache.
>
> > I'm happy to resubmit the masking exception patch as version 2, if it works
> > in my test case.
> >
> > Advice?
>
> Please test the below, which simply eats any exception on VMXOFF.
>
> diff --git a/arch/x86/include/asm/virtext.h b/arch/x86/include/asm/virtext.h
> index 9aad0e0876fb..54bc84d7028d 100644
> --- a/arch/x86/include/asm/virtext.h
> +++ b/arch/x86/include/asm/virtext.h
> @@ -32,13 +32,15 @@ static inline int cpu_has_vmx(void)
>
> /** Disable VMX on the current CPU
> *
> - * vmxoff causes a undefined-opcode exception if vmxon was not run
> - * on the CPU previously. Only call this function if you know VMX
> - * is enabled.
> + * VMXOFF causes a #UD if the CPU is not post-VMXON; eat any #UDs to handle
> + * races with a hypervisor doing VMXOFF, e.g. if an NMI arrived between
> + * VMXOFF and clearing CR4.VMXE.
> */
> static inline void cpu_vmxoff(void)
> {
> - asm volatile ("vmxoff");
> + asm volatile("1: vmxoff\n\t"
> + "2:\n\t"
> + _ASM_EXTABLE(1b, 2b));
> cr4_clear_bits(X86_CR4_VMXE);
> }
I think that just eating exceptions like this is asking for trouble.
How about having a separate cpu_emergency_vmxoff() that eats
exceptions and leaving cpu_vmxoff() alone? Or make cpu_vmxoff()
return an error on failure and have the normal caller WARN if there's
an error.
Silently eating exceptions in the non-emergency path makes it too easy
to regress something without noticing.
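For what it's worth, the second option could be sketched roughly as below. This is only an illustration of the "return an error and let the normal caller WARN" idea, not a tested patch; the asm-goto fixup style, the -EIO return value, and the caller name are my assumptions:

```c
/*
 * Sketch: cpu_vmxoff() reports failure instead of silently eating it.
 * If VMXOFF faults (CPU not post-VMXON), the exception table entry
 * redirects execution to the "fault" label.  CR4.VMXE is cleared on
 * both paths so the end state is the same either way.
 */
static inline int cpu_vmxoff(void)
{
	asm_volatile_goto("1: vmxoff\n\t"
			  _ASM_EXTABLE(1b, %l[fault])
			  ::: "cc", "memory" : fault);
	cr4_clear_bits(X86_CR4_VMXE);
	return 0;

fault:
	cr4_clear_bits(X86_CR4_VMXE);
	return -EIO;
}

/* Hypothetical non-emergency caller: a fault here is unexpected, so warn. */
static void example_disable_vmx(void)
{
	WARN_ON_ONCE(cpu_vmxoff());
}
```

The emergency path would then ignore the return value, while the regular path surfaces any unexpected fault via the WARN instead of hiding a regression.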