Message-ID: <aCSSptnxW7EBEzSQ@google.com>
Date: Wed, 14 May 2025 05:55:19 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Amit Shah <Amit.Shah@....com>
Cc: "jon@...anix.com" <jon@...anix.com>, "x86@...nel.org" <x86@...nel.org>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>, "hpa@...or.com" <hpa@...or.com>,
"mingo@...hat.com" <mingo@...hat.com>, "tglx@...utronix.de" <tglx@...utronix.de>, "bp@...en8.de" <bp@...en8.de>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 06/18] KVM: VMX: Wire up Intel MBEC enable/disable logic
On Wed, May 14, 2025, Amit Shah wrote:
> On Tue, 2025-05-13 at 06:28 -0700, Sean Christopherson wrote:
> > On Tue, May 13, 2025, Jon Kohler wrote:
> > > > On May 12, 2025, at 2:23 PM, Sean Christopherson
> > > > This is wrong and unnecessary. As mentioned earlier, the input that
> > > > matters is vmcs12. This flag should *never* be set for vmcs01.
> > >
> > > I’ll page this back in, but I’m like 75% sure it didn’t work when I
> > > did it that way.
> >
> > Then you had other bugs. The control is per-VMCS and thus needs to
> > be emulated as such. Definitely holler if you get stuck, there's no
> > need to develop this in complete isolation.
>
> Looking at this from the AMD GMET POV, here's how I think support for
> this feature for a Windows guest would be implemented:
>
> * Do not enable the GMET feature in vmcb01. Only the Windows guest (L1
> guest) sets this bit for its own guest (L2 guest). KVM (L0) should see
> the bit set in vmcb02 (and vmcb12). OTOH, pass on the CPUID bit to the
> L1 guest.
>
> * KVM needs to propagate the #NPF to Windows (instead of handling
> anything itself -- ie no shadow page table adjustments or walks
> needed). Windows spawns an L2 guest that causes the #NPF, and Windows
> is the one that needs to consume that fault.
>
> * KVM needs to differentiate an #NPF exit due to GMET or non-GMET
> condition -- check the CPL and U/S bits from the exit, and the NX bit
> from the PTE that faulted. If due to GMET, propagate it to the guest.
> If not, continue handling it.
Yes, but no. KVM shouldn't need to do anything special here other than teaching
update_permission_bitmask() to understand the GMET fault case. Ditto for MBEC.
I'd type something up, but I would quickly encounter -ENOCOFFE :-)

With the correct mmu->permissions[], permission_fault() will naturally detect
that a #NPF (or EPT Violation) from L2 due to a GMET/MBEC violation is a fault
in the nNPT/nEPT domain and route the exit to L1.
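
For illustration only, here is a toy user-space model of that idea, not actual
KVM code. It assumes GMET faults on a supervisor-mode instruction fetch from a
page that the nested tables mark user-accessible. Every toy_* and TOY_* name is
hypothetical, and the ordinary U/S, SMEP and SMAP checks are left out to keep
the sketch short.

```c
#include <stdbool.h>
#include <stdint.h>

/* Access bits gathered from a page-table walk (hypothetical toy layout). */
#define TOY_ACC_EXEC   (1u << 0)
#define TOY_ACC_WRITE  (1u << 1)
#define TOY_ACC_USER   (1u << 2)

/* Fault error-code bits, following the x86 page-fault error-code layout. */
#define TOY_PFERR_WRITE (1u << 1)
#define TOY_PFERR_USER  (1u << 2)
#define TOY_PFERR_FETCH (1u << 4)

#define TOY_NR_INDEXES 16	/* indexed by (pfec >> 1), i.e. bits 1..4 */

struct toy_mmu {
	/* Bit N of permissions[i] is set if access combination N faults. */
	uint8_t permissions[TOY_NR_INDEXES];
};

/* Precompute, for every fault type, which access combinations must fault. */
static void toy_update_permission_bitmask(struct toy_mmu *mmu, bool gmet)
{
	for (unsigned int idx = 0; idx < TOY_NR_INDEXES; idx++) {
		unsigned int pfec = idx << 1;
		bool wf = pfec & TOY_PFERR_WRITE;
		bool uf = pfec & TOY_PFERR_USER;
		bool ff = pfec & TOY_PFERR_FETCH;
		uint8_t faults = 0;

		for (unsigned int acc = 0; acc < 8; acc++) {
			bool fault = false;

			if (wf && !(acc & TOY_ACC_WRITE))
				fault = true;
			if (ff && !(acc & TOY_ACC_EXEC))
				fault = true;
			/* Assumed GMET rule: supervisor fetch from a user page. */
			if (gmet && ff && !uf && (acc & TOY_ACC_USER))
				fault = true;

			if (fault)
				faults |= (uint8_t)(1u << acc);
		}
		mmu->permissions[idx] = faults;
	}
}

/* The fault-time check is then a single table lookup. */
static bool toy_permission_fault(const struct toy_mmu *mmu,
				 unsigned int pte_access, unsigned int pfec)
{
	return (mmu->permissions[(pfec >> 1) & 15] >> (pte_access & 7)) & 1;
}
```

The point of the table is that all feature-specific rules live in the update
step, so the fault path needs no GMET/MBEC special casing.
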
> (btw KVM MMU API question -- from the #NPF, I have the GPA of the L2
> guest. How to go from that guest GPA to look up the NX bit for that
> page? I skimmed and there doesn't seem to be an existing API for it -
> so is walking the tables the only solution?)
As above, KVM doesn't manually look up individual bits while handling faults.
The walk of the guest page tables (L1's NPT/EPT for this scenario) performed by
FNAME(walk_addr_generic) gathers the effective permissions in walker->pte_access
and then checks them with permission_fault() once the walk is completed.
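
As a rough sketch of what "effective permissions" means here, assuming each
level of the walk yields a set of access bits and the result is their
intersection (toy_walk_access() and the TOY_* names are hypothetical, and this
is not the real walker):

```c
#include <stddef.h>

/* Access bits per page-table level (hypothetical toy layout). */
#define TOY_ACC_EXEC   (1u << 0)
#define TOY_ACC_WRITE  (1u << 1)
#define TOY_ACC_USER   (1u << 2)
#define TOY_ACC_ALL    (TOY_ACC_EXEC | TOY_ACC_WRITE | TOY_ACC_USER)

/*
 * Combine the access bits seen at each level of a walk. A permission
 * survives only if every level grants it, so the result is a bitwise AND.
 */
static unsigned int toy_walk_access(const unsigned int *level_access,
				    size_t nr_levels)
{
	unsigned int access = TOY_ACC_ALL;

	for (size_t i = 0; i < nr_levels; i++)
		access &= level_access[i];

	return access;
}
```

The combined value is what a single permission check would consume after the
walk, rather than any one entry's bits.
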