linux-kernel - Re: [PATCH] x86/mm: Don't try to change poison pages to uncacheable in a guest


Message-ID: <20200518153625.GA31444@agluck-desk2.amr.corp.intel.com>
Date:   Mon, 18 May 2020 08:36:25 -0700
From:   "Luck, Tony" <tony.luck@...el.com>
To:     Borislav Petkov <bp@...en8.de>
Cc:     Jue Wang <juew@...gle.com>,
        "Williams, Dan J" <dan.j.williams@...el.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86/mm: Don't try to change poison pages to uncacheable
 in a guest

On Mon, May 18, 2020 at 03:48:13PM +0200, Borislav Petkov wrote:
> Hi,
> 
> lemme try to reply to three emails at once.
> 
> First of all, the two of you: pls do not top-post.

Sorry. Phone e-mail client is dumb.

> On Sat, May 16, 2020 at 6:52 PM Luck, Tony <tony.luck@...el.com> wrote:
> > But the guest isn’t likely to do the right thing with a page fault.
> > The guest just accessed a page that it knows is poisoned (VMM just told
> > it once that it was poisoned). There is no reason that the VMM should
> > let the guest actually touch the poison a second time. But if the guest
> > does, then the guest should get the expected response. I.e. another
> > machine check.
> 
> So Jue says below that the VMM has unmapped the guest page from the EPT.
> So the guest cannot touch the poison anymore.
> 
> How is it then possible for the guest to touch it again if the page is
> not mapped anymore?

The VMM wants to make sure that the guest can't touch the poison.
This is important because not every touch of poison results in a
recoverable machine check. If the guest's next touch is an unaligned
access that crosses from the poisoned cache line to a non-poisoned line,
then h/w will signal a fatal machine check and the whole machine will
go down.
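
To make the "crosses from the poison cache line to a non-poisoned line"
case concrete, here is a minimal sketch of the address arithmetic. It
assumes 64-byte cache lines; CACHE_LINE_SIZE and crosses_cache_line()
are names made up for illustration, not anything in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. Adjust if the part differs. */
#define CACHE_LINE_SIZE 64u

/*
 * Hypothetical helper: returns true if the access [addr, addr + len)
 * touches more than one cache line, i.e. it starts in one line and
 * ends in a later one.
 */
static bool crosses_cache_line(uint64_t addr, size_t len)
{
	if (len == 0)
		return false;

	/* Compare the line index of the first and the last byte touched. */
	return (addr / CACHE_LINE_SIZE) !=
	       ((addr + len - 1) / CACHE_LINE_SIZE);
}
```

An 8-byte load at offset 0x38 within a line stays inside that line,
while the same load at offset 0x3c spills 4 bytes into the next one.
That second shape is the access the paragraph above is worried about.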

> The guest should get a #PF when the page is unmapped and cause a new
> page to be mapped.

The VMM gets the page fault (because the unmapping of the guest
physical address is at the VMM EPT level).  The VMM can't map a new
page into that guest physical address because it has no way to
replace the contents of the old page.  The VMM could pass the #PF
to the guest, but that would just confuse the guest (its page tables
all say that the page is still valid). In this particular case the
page is part of the 1:1 kernel map. So the kernel will OOPS (I think).

> On Sun, May 17, 2020 at 07:36:00AM -0700, Jue Wang wrote:
> > The stack is from guest MCE handler's processing of the first MCE injected.
> 
> Aha, so you've flipped the functions order in the trace. It all starts
> at
> 
>   set_mce_nospec(m->addr >> PAGE_SHIFT);
> 
> Now it makes sense.
> 
> > Note before the first MCE is injected into guest (i.e., after the host MCE
> > handler successfully finished MCE handling and notified VMM via SIGBUS with
> > BUS_MCEERR_AR), VMM unmaps the guest page from EPT.
> 
> Ok, good.
> 
> > The guest MCE handler finished the "normal" MCE handling and recovery
> > (memory_failure() in mm/memory-failure.c) successfully, it's the aftermath
> > below leading to the stack trace:
> > https://github.com/torvalds/linux/blob/5a9ffb954a3933d7867f4341684a23e008d6839b/arch/x86/kernel/cpu/mce/core.c#L1101
> 
> On Sun, May 17, 2020 at 08:33:00AM -0700, Jue Wang wrote:
> > In other words, it's the *do_memory_failure -> set_mce_nospec*  flow of
> > guest MCE handler acting on the *first* MCE injection. As a result, the
> > guest panics and resets *whenever* there is an MCE injected, even when the
> > injected MCE is recoverable.
> 
> So IIUC that set_mce_nospec() thing should check whether m->addr is
> mapped first and only then mark it _uc and whatever monkey business it
> does. Not this blanket

Please explain how a guest (that doesn't even know that it is a guest)
is going to figure out that the EPT tables (that it has no way to access)
have marked this page invalid in guest physical address space.

>   if am I a guest?
> 
> test.
> 
> Imagine a hypervisor which doesn't set X86_FEATURE_HYPERVISOR, i.e.,
> CPUID(1)[ECX, bit 31]?

Guest is going to be screwed in this case.
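
For reference, the "am I a guest?" test under discussion comes down to
one CPUID bit: leaf 1, ECX bit 31, which is what X86_FEATURE_HYPERVISOR
reflects. A rough user-space sketch follows. The helper names are made
up for illustration, and __get_cpuid() from <cpuid.h> is the GCC/clang
wrapper, not the kernel's own CPUID code.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>	/* GCC/clang helper providing __get_cpuid() */
#endif

/* CPUID leaf 1, ECX bit 31: set by hypervisors that choose to advertise. */
#define CPUID_1_ECX_HYPERVISOR (1u << 31)

/* Hypothetical helper: decode the hypervisor bit from a leaf 1 ECX value. */
static bool ecx_reports_hypervisor(uint32_t ecx)
{
	return (ecx & CPUID_1_ECX_HYPERVISOR) != 0;
}

/*
 * Hypothetical helper: ask the CPU directly. Returns false on non-x86
 * builds or if leaf 1 is unavailable. A false result does not prove
 * bare metal: a hypervisor that hides the bit looks the same.
 */
static bool running_under_hypervisor(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return false;
	return ecx_reports_hypervisor(ecx);
#else
	return false;
#endif
}
```

The check can only report what the hypervisor chooses to expose, which
is the limitation being pointed out above.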

-Tony