linux-kernel - Re: [PATCH] x86/mm: Don't try to change poison pages to uncacheable in a guest

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <20200518134813.GC25034@zn.tnic>
Date:   Mon, 18 May 2020 15:48:13 +0200
From:   Borislav Petkov <bp@...en8.de>
To:     Jue Wang <juew@...gle.com>
Cc:     "Luck, Tony" <tony.luck@...el.com>,
        "Williams, Dan J" <dan.j.williams@...el.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86/mm: Don't try to change poison pages to uncacheable
 in a guest

Hi,

lemme try to reply to three emails at once.

First of all, the two of you: pls do not top-post.

On Sat, May 16, 2020 at 6:52 PM Luck, Tony <tony.luck@...el.com> wrote:
> But the guest isn’t likely to do the right thing with a page fault.
> The guest just accessed a page that it knows is poisoned (VMM just told
> it once that it was poisoned). There is no reason that the VMM should
> let the guest actually touch the poison a second time. But if the guest
> does, then the guest should get the expected response. I.e. another
> machine check.

So Jue says below that the VMM has unmapped the guest page from the EPT.
So the guest cannot touch the poison anymore.

How is then possible for the guest to touch it again if the page is not
mapped anymore?

The guest should get a #PF when the page is unmapped and cause a new
page to be mapped.

On Sun, May 17, 2020 at 07:36:00AM -0700, Jue Wang wrote:
> The stack is from guest MCE handler's processing of the first MCE injected.

Aha, so you've flipped the functions order in the trace. It all starts
at

  set_mce_nospec(m->addr >> PAGE_SHIFT);

Now it makes sense.

> Note before the first MCE is injected into guest (i.e., after the host MCE
> handler successfully finished MCE handling and notified VMM via SIGBUS with
> BUS_MCEERR_AR), VMM unmaps the guest page from EPT.

Ok, good.

> The guest MCE handler finished the "normal" MCE handling and recovery
> (memory_failure() in mm/memory_failure.cc) successfully, it's the aftermath
> below leading to the stack trace:
> https://github.com/torvalds/linux/blob/5a9ffb954a3933d7867f4341684a23e008d6839b/arch/x86/kernel/cpu/mce/core.c#L1101

On Sun, May 17, 2020 at 08:33:00AM -0700, Jue Wang wrote:
> In other words, it's the *do_memory_failure -> set_mce_nospec*  flow of
> guest MCE handler acting on the *first* MCE injection. As a result, the
> guest panics and resets *whenever* there is an MCE injected, even when the
> injected MCE is recoverable.

So IIUC that set_mce_nospec() thing should check whether m->addr is
mapped first and only then mark it _uc and whatever monkey business it
does. Not this blanket

  if am I a guest?

test.

Imagine a hypervisor which doesn't set X86_FEATURE_HYPERVISOR, i.e.,
CPUID(1)[EDX, bit 31]?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette