[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bz3vlh3lri7yfckdkddopwhsgvkmizhw5q6ecomgeba7q22ufp@ntu2kugiho4o>
Date: Tue, 10 Dec 2024 15:26:40 +0200
From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>,
Andy Lutomirski <luto@...nel.org>, Albert Ou <aou@...s.berkeley.edu>,
Alexei Starovoitov <ast@...nel.org>, Andrea Parri <parri.andrea@...il.com>,
Arnd Bergmann <arnd@...db.de>, Daniel Borkmann <daniel@...earbox.net>,
Eric Chan <ericchancf@...gle.com>, Jason Gunthorpe <jgg@...pe.ca>, Kai Huang <kai.huang@...el.com>,
Kefeng Wang <wangkefeng.wang@...wei.com>, Kent Overstreet <kent.overstreet@...ux.dev>,
Palmer Dabbelt <palmer@...osinc.com>, Paul Walmsley <paul.walmsley@...ive.com>,
Russell King <linux@...linux.org.uk>, Samuel Holland <samuel.holland@...ive.com>,
Suren Baghdasaryan <surenb@...gle.com>, Yuntao Wang <ytcoode@...il.com>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org, linux-riscv@...ts.infradead.org,
Tom Lendacky <thomas.lendacky@....com>, Ashish Kalra <ashish.kalra@....com>,
"Maciej W. Rozycki" <macro@...am.me.uk>
Subject: Re: [PATCHv2 2/2] x86/mm: Make memremap(MEMREMAP_WB) map memory as
encrypted by default
On Thu, Nov 21, 2024 at 12:49:52PM +0100, Borislav Petkov wrote:
> On Tue, Nov 19, 2024 at 10:21:05AM +0200, Kirill A. Shutemov wrote:
> > Sure, we can workaround every place that touches such ranges.
>
> Every place? Which every place? I thought this is only an EISA issue?
I looked at other places where we call memremap(MEMREMAP_WB) such as
acpi_wakeup_cpu(). We actually get encrypted/private mapping for this
callsite despite __ioremap_caller() being called encrypted == false.
This happens because of IORES_MAP_ENCRYPTED check in __ioremap_caller().
So we depend on the BIOS here. The EISA problem happens because the
target memory is in !IORES_MAP_ENCRYPTED memory.
It's hard to say if any other memremap(MEMREMAP_WB) would trigger the
issue. And what will happen after next BIOS update.
> Then clearly your changelogs need to expand considerably more what we're
> *really* addressing here.
>
> > Or we can address problem at the root and make creating decrypted/shared
> > mappings explicit.
>
> What is the problem? That KVM implicitly converts memory to shared? Why does
> KVM do that an can it be fixed not to?
>
> Doesn't sound like the guest's problem.
Well, the problem is on the both sides.
VMM behaviour on such accesses is not specified in any spec. AFAIK all
current VMM implementations do this implicit conversion.
I think it has to be fixed. VMMs (not only KVM) should not silently
convert memory to shared. But VMMs cannot make memory access to go away.
The only option they have is to inject #VE instead indicating bogus
access. At this point it becomes a guest problem.
It will get fixed in VMMs naturally when TDX Connect gets enabled.
With a secure device assigned to a TD, VMM would loose the ability to
convert memory on its own. The guest would have to unlock the memory
first. This will make implicit conversion impossible.
But it also means guest should never initiate shared access without
explicit conversion. Otherwise #VE will crash it.
> Or maybe this needs a lot more explanation what we're fixing here.
>
> > Such mappings have both functional (as we see here) and security
> > implications (VMM can manipulate the guest memory range). We should not
> > create decrypted mappings by default on legacy interfaces.
>
> So we're getting closer.
>
> The changes themselves are fine but your text is missing a lot about what
> we're fixing here. When I asked, I barely scratched the surface. So can we
> elaborate here pls?
What about this:
x86/mm: Make memremap(MEMREMAP_WB) map memory as encrypted by default
Currently memremap(MEMREMAP_WB) can produce decrypted/shared mapping:
memremap(MEMREMAP_WB)
arch_memremap_wb()
ioremap_cache()
__ioremap_caller(.encrytped = false)
In such cases, the IORES_MAP_ENCRYPTED flag on the memory will determine
if the resulting mapping is encrypted or decrypted.
Creating a decrypted mapping without explicit request from the caller is
risky:
- It can inadvertently expose the guest's data and compromise the
guest.
- Accessing private memory via shared/decrypted mapping on TDX will
either trigger implicit conversion to shared or #VE (depending on
VMM implementation).
Implicit conversion is destructive: subsequent access to the same
memory via private mapping will trigger a hard-to-debug #VE crash.
The kernel already provides a way to request decrypted mapping
explicitly via the MEMREMAP_DEC flag.
Modify memremap(MEMREMAP_WB) to produce encrypted/private mapping by
default unless MEMREMAP_DEC is specified.
This change fixes the crash on kexec in TDX guests if CONFIG_EISA is
enabled.
--
Kiryl Shutsemau / Kirill A. Shutemov
Powered by blists - more mailing lists