Message-ID: <20200518165500.GD25034@zn.tnic>
Date: Mon, 18 May 2020 18:55:00 +0200
From: Borislav Petkov <bp@...en8.de>
To: "Luck, Tony" <tony.luck@...el.com>
Cc: Jue Wang <juew@...gle.com>,
"Williams, Dan J" <dan.j.williams@...el.com>,
"x86@...nel.org" <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86/mm: Don't try to change poison pages to uncacheable
in a guest
On Mon, May 18, 2020 at 08:36:25AM -0700, Luck, Tony wrote:
> The VMM gets the page fault (because the unmapping of the guest
> physical address is at the VMM EPT level). The VMM can't map a new
> page into that guest physical address because it has no way to
> replace the contents of the old page. The VMM could pass the #PF
> to the guest, but that would just confuse the guest (its page tables
> all say that the page is still valid). In this particular case the
> page is part of the 1:1 kernel map. So the kernel will OOPS (I think).
...
> Please explain how a guest (that doesn't even know that it is a guest)
> is going to figure out that the EPT tables (that it has no way to access)
> have marked this page invalid in guest physical address space.
So somewhere BUS_MCEERR_AR was mentioned. So I'm assuming the error
severity was "action required". What does happen in the kernel, on
baremetal, with an AR error in kernel space, i.e., kernel memory?
If we can't fixup the exception, we die.
So why should the guest behave any differently?
Now, if you want the guest to be more "robust" and handle that
thing, fine. But then you'd need an explicit way to tell the guest
kernel: "you've just had an MCE and I unmapped the page" so that the
guest kernel can figure out what to do, even if that means panicking.
I.e., explicitly signal the EPT violation Jue is talking about
in the other mail.
You can inject a #PF or, better yet, the *first* MCE which is being
injected should say with a bit somewhere "I unmapped the address in
m->addr". So that the guest kernel can handle that properly and know
what *exactly* it is getting an MCE for.
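To make the idea concrete, here is a minimal userspace sketch of how a
handler could act on such an explicit indication instead of asking "am I
a guest". Everything in it is an assumption for illustration: the flag
SKETCH_MCE_ADDR_UNMAPPED, struct sketch_mce, and sketch_handle_mce() are
hypothetical names, not existing MCA bits or kernel interfaces, and the
struct is only a simplified stand-in for the kernel's struct mce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag, NOT an existing MCA bit: the injector sets it in the
 * first MCE to say "I unmapped the address in addr". */
#define SKETCH_MCE_ADDR_UNMAPPED (1u << 0)

/* Simplified stand-in for struct mce; only the fields this sketch needs. */
struct sketch_mce {
	uint64_t addr;            /* reported address, like m->addr */
	uint32_t flags;           /* hypothetical side-band flags */
	bool     action_required; /* severity is "action required" (AR) */
	bool     in_kernel;       /* the error hit kernel memory */
	bool     can_fixup;       /* an exception fixup is available */
};

enum sketch_action {
	SKETCH_LOG_ONLY,          /* not AR: just record the event */
	SKETCH_RECOVER,           /* normal recovery path */
	SKETCH_RECOVER_SKIP_REMAP,/* recover, but leave the mapping alone */
	SKETCH_PANIC,             /* cannot continue */
};

/* Decide what to do based on what the MCE itself says, with no
 * "am I running as a guest" check anywhere. */
enum sketch_action sketch_handle_mce(const struct sketch_mce *m)
{
	assert(m != NULL);

	if (!m->action_required)
		return SKETCH_LOG_ONLY;

	/* Same rule as on bare metal: an AR error in kernel memory that
	 * cannot be fixed up is fatal. */
	if (m->in_kernel && !m->can_fixup)
		return SKETCH_PANIC;

	/* The injector told us explicitly that the page is already
	 * unmapped, so do not try to change its attributes. */
	if (m->flags & SKETCH_MCE_ADDR_UNMAPPED)
		return SKETCH_RECOVER_SKIP_REMAP;

	return SKETCH_RECOVER;
}
```

The point of the sketch is only the shape of the decision: the handler
keys off information carried by the MCE, so the same code path works on
bare metal (flag never set) and under a VMM (flag set by the injector).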
What I don't like is the "am I running as a guest" check. Because
someone else would come later and say, err, I'm not virtualizing this
portion of MCA either, lemme add another "am I guest" check.
Sure, it is a lot easier but when stuff like that starts spreading
around in the MCE code, then we can just as well disable MCE when
virtualized altogether. It would be a lot easier for everybody.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette