Message-ID: <f1800b4f27554df2b2c538bdbe0a38419a231a09.camel@intel.com>
Date: Wed, 5 Feb 2025 03:51:00 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "seanjc@...gle.com" <seanjc@...gle.com>, "Xu, Min M" <min.m.xu@...el.com>
CC: "kvm@...r.kernel.org" <kvm@...r.kernel.org>, "dave.hansen@...ux.intel.com"
<dave.hansen@...ux.intel.com>, "thomas.lendacky@....com"
<thomas.lendacky@....com>, "dionnaglaze@...gle.com" <dionnaglaze@...gle.com>,
"Wu, Binbin" <binbin.wu@...el.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "kirill.shutemov@...ux.intel.com"
<kirill.shutemov@...ux.intel.com>, "mingo@...hat.com" <mingo@...hat.com>,
"pbonzini@...hat.com" <pbonzini@...hat.com>, "tglx@...utronix.de"
<tglx@...utronix.de>, "hpa@...or.com" <hpa@...or.com>, "vkuznets@...hat.com"
<vkuznets@...hat.com>, "bp@...en8.de" <bp@...en8.de>, "jgross@...e.com"
<jgross@...e.com>, "x86@...nel.org" <x86@...nel.org>, "pgonda@...gle.com"
<pgonda@...gle.com>
Subject: Re: [PATCH 0/2] x86/kvm: Force legacy PCI hole as WB under SNP/TDX
+Min, can you comment?
3a3b12cbda ("UefiCpuPkg/MtrrLib: MtrrLibIsMtrrSupported always return FALSE in
TD-Guest") turned out to be problematic in practice.
Full thread:
https://lore.kernel.org/kvm/20250201005048.657470-1-seanjc@google.com/
On Mon, 2025-02-03 at 16:27 -0800, Sean Christopherson wrote:
> On Mon, Feb 03, 2025, Rick P Edgecombe wrote:
> > On Mon, 2025-02-03 at 12:33 -0800, Sean Christopherson wrote:
> > > > Since there is no upstream KVM TDX support yet, why isn't it an option to
> > > > still
> > > > revert the EDKII commit too? It was a relatively recent change.
> > >
> > > I'm fine with that route too, but it too is a band-aid. Relying on the
> > > *untrusted*
> > > hypervisor to essentially communicate memory maps is not a winning strategy.
> > >
> > > > To me it seems that the normal KVM MTRR support is not ideal, because it is
> > > > still lying about what it is doing. For example, in the past there was an
> > > > attempt to use UC to prevent speculative execution accesses to sensitive
> > > > data.
> > > > The KVM MTRR support only happens to work with existing guests, but not all
> > > > possible MTRR usages.
> > > >
> > > > Since diverging from the architecture creates loose ends like that, we could
> > > > instead define some other way for EDKII to communicate the ranges to the
> > > > kernel.
> > > > Like some simple KVM PV MSRs that are for communication only, and not
> > >
> > > Hard "no" to any PV solution. This isn't KVM specific, and as above, bouncing
> > > through the hypervisor to communicate information within the guest is asinine,
> > > especially for CoCo VMs.
> >
> > Hmm, right.
> >
> > So the other options could be:
> >
> > 1. Some TDX module feature to hold the ranges:
> > - Con: Not shared with AMD
> >
> > 2. Re-use MTRRs for the communication, revert changes in guest and edk2:
>
> Thinking more about how EDK2 is consumed downstream, I think reverting the EDK2
> changes is necessary regardless of what happens in the kernel. Or at the least,
> somehow communicate to EDK2 users that ingesting those changes is a bad idea
> unless the kernel has also been updated.
>
> AFAIK, Bring Your Own Firmware[*] isn't widely adopted, which means that the CSP
> is shipping the firmware. And shipping OVMF/EDK2 with the "ignores MTRRs" code
> will cause problems for guests without commit 8e690b817e38 ("x86/kvm: Override
> default caching mode for SEV-SNP and TDX"). Since the host doesn't control the
> guest kernel, there's no way to know if deploying those EDK2 changes is safe.
>
> [*] https://kvm-forum.qemu.org/2024/BYOF_-_KVM_Forum_2024_iWTioIP.pdf
>
Hmm. Since there is no upstream TDX KVM support, for its part, I guess KVM
should still get a chance to define a cleaner solution (if there actually was a
cleaner solution). But yea, it would mean only components from after the
solution was settled could be used together for a fully working stack. And
it should probably be called out somehow. Maybe it could be in the KVM TDX docs
or something.
Still seems like a thing to avoid if possible.
> > - Con: Creating more half support, when it's technically not required
> > - Con: Still bouncing through the hypervisor
>
> I assume by "Re-use MTRRs for the communication" you also mean updating the guest
> to address the "everything is UC!" flaw, otherwise another con is:
>
> - Con: Doesn't address the performance issue with TDX guests "using" UC
> memory by default (unless there's yet more enabled).
Hmm. This is quite the tangled corner.
>
> Presumably that can be accomplished by simply skipping the CR0.CD toggling, and
> doing MTRR stuff as normal?
I'll have to get back to you on this one. Kirill probably could give a better
answer, but likely will not be able to follow up on this thread until next week.
>
> > - Pro: Design and code is clear
> >
> > 3. Create some new architectural definition, like a bit that means "MTRRs don't
> > actually work":
> > - Con: Takes a long time, need to get agreement
> > - Con: Still bouncing through the hypervisor
>
> Not for KVM guests. As I laid out in my bug report, it's safe to assume MTRRs
> don't actually affect the memory type when running under KVM.
>
> FWIW, PAT doesn't "work" on most KVM Intel setups either, because of misguided
> KVM code that resulted in "Ignore Guest PAT" being set in all EPTEs for the
> overwhelming majority of guests. That's not desirable long term because it
> prevents the guest from using WC (via PAT) in situations where doing so is needed
> for performance and/or correctness.
>
> > - Pro: More pure solution
>
> MTRRs "not working" is a red herring. The problem isn't that MTRRs don't work,
> it's that the kernel is (somewhat unknowingly) using MTRRs as a crutch to get the
> desired memtype for devices. E.g. for emulated MMIO, MTRRs _can't_ be virtualized,
> because there's never a valid mapping, i.e. there is no physical memory and thus
> no memtype. In other words, under KVM guests (and possibly other hypervisors),
> MTRRs end up being nothing more than a communication channel between guest firmware
> and the kernel.
Yea.
>
> The gap for CoCo VMs is that using MTRRs is undesirable because they are controlled
> by the untrusted host. But that's largely a future problem, unless someone has a
> clever way to fix the kernel mess.
>
>
Yea, I wondered about that too. I imagine the thinking was that since it is only
controlling shared memory, it can be untrusted.
And I guess the solution in this patchset is hypothetically a bit more locked
down in that respect.