linux-kernel - Re: [RFC PATCH 06/18] KVM: VMX: Wire up Intel MBEC enable/disable logic

Message-ID: <aCSSptnxW7EBEzSQ@google.com>
Date: Wed, 14 May 2025 05:55:19 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Amit Shah <Amit.Shah@....com>
Cc: "jon@...anix.com" <jon@...anix.com>, "x86@...nel.org" <x86@...nel.org>, 
	"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>, "hpa@...or.com" <hpa@...or.com>, 
	"mingo@...hat.com" <mingo@...hat.com>, "tglx@...utronix.de" <tglx@...utronix.de>, "bp@...en8.de" <bp@...en8.de>, 
	"kvm@...r.kernel.org" <kvm@...r.kernel.org>, "pbonzini@...hat.com" <pbonzini@...hat.com>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 06/18] KVM: VMX: Wire up Intel MBEC enable/disable logic

On Wed, May 14, 2025, Amit Shah wrote:
> On Tue, 2025-05-13 at 06:28 -0700, Sean Christopherson wrote:
> > On Tue, May 13, 2025, Jon Kohler wrote:
> > > > On May 12, 2025, at 2:23 PM, Sean Christopherson
> > > > This is wrong and unnecessary.  As mentioned earlier, the input that
> > > > matters is vmcs12.  This flag should *never* be set for vmcs01.
> > > 
> > > I’ll page this back in, but I’m like 75% sure it didn’t work when I
> > > did it that way.
> > 
> > Then you had other bugs.  The control is per-VMCS and thus needs to
> > be emulated as such.  Definitely holler if you get stuck, there's no
> > need to develop this in complete isolation.
> 
> Looking at this from the AMD GMET POV, here's how I think support for
> this feature for a Windows guest would be implemented:
> 
> * Do not enable the GMET feature in vmcb01.  Only the Windows guest (L1
> guest) sets this bit for its own guest (L2 guest).  KVM (L0) should see
> the bit set in vmcb02 (and vmcb12).  OTOH, pass on the CPUID bit to the
> L1 guest.
> 
> * KVM needs to propagate the #NPF to Windows (instead of handling
> anything itself -- ie no shadow page table adjustments or walks
> needed).  Windows spawns an L2 guest that causes the #NPF, and Windows
> is the one that needs to consume that fault.
> 
> * KVM needs to differentiate an #NPF exit due to GMET or non-GMET
> condition -- check the CPL and U/S bits from the exit, and the NX bit
> from the PTE that faulted.  If due to GMET, propagate it to the guest.
> If not, continue handling it.

Yes, but no.  KVM shouldn't need to do anything special here other than teaching
update_permission_bitmask() to understand the GMET fault case.  Ditto for MBEC.
I'd type something up, but I would quickly encounter -ENOCOFFE :-)

With the correct mmu->permissions[], permission_fault() will naturally detect
that a #NPF (or EPT Violation) from L2 due to a GMET/MBEC violation is a fault
in the nNPT/nEPT domain and route the exit to L1.
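
To illustrate the idea (not the actual KVM code), here is a minimal,
self-contained toy model of a precomputed permission table with one extra
GMET-like rule.  Everything in it is hypothetical: the toy_* names, the bit
encodings, and the simplified rule "a supervisor fetch from a user page
faults" are assumptions made for the example, and real NPT/EPT semantics
have more cases than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy encodings, NOT KVM's definitions. */
enum { ACC_EXEC = 1, ACC_WRITE = 2, ACC_USER = 4, ACC_COMBOS = 8 };
enum { F_WRITE = 1, F_USER = 2, F_FETCH = 4, F_COUNT = 8 };

struct toy_mmu {
	bool gmet;			/* GMET-like control enabled by L1 */
	uint8_t permissions[F_COUNT];	/* bit N set => pte_access N faults */
};

/* Precompute, per access type, which permission combinations fault. */
static void toy_update_permission_bitmask(struct toy_mmu *mmu)
{
	for (unsigned int f = 0; f < F_COUNT; f++) {
		uint8_t faults = 0;

		for (unsigned int acc = 0; acc < ACC_COMBOS; acc++) {
			bool x = acc & ACC_EXEC;
			bool w = acc & ACC_WRITE;
			bool u = acc & ACC_USER;
			bool fault = false;

			/* Simplified baseline checks. */
			if ((f & F_FETCH) && !x)
				fault = true;
			if ((f & F_WRITE) && !w)
				fault = true;
			if ((f & F_USER) && !u)
				fault = true;

			/* Assumed GMET-like rule: supervisor fetch from a user page. */
			if (mmu->gmet && (f & F_FETCH) && !(f & F_USER) && u)
				fault = true;

			if (fault)
				faults |= (uint8_t)(1u << acc);
		}
		mmu->permissions[f] = faults;
	}
}

/* The fault-time check is then a single table lookup. */
static bool toy_permission_fault(const struct toy_mmu *mmu,
				 unsigned int pte_access, unsigned int f)
{
	assert(pte_access < ACC_COMBOS && f < F_COUNT);
	return (mmu->permissions[f] >> pte_access) & 1;
}
```

In this sketch, toggling the gmet flag and recomputing the table is the only
change needed; the lookup at fault time stays the same, which is the point
being made above.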

> (btw KVM MMU API question -- from the #NPF, I have the GPA of the L2
> guest.  How to go from that guest GPA to look up the NX bit for that
> page?  I skimmed and there doesn't seem to be an existing API for it -
> so is walking the tables the only solution?)

As above, KVM doesn't manually look up individual bits while handling faults.
The walk of the guest page tables (L1's NPT/EPT for this scenario) performed by
FNAME(walk_addr_generic) gathers the effective permissions in
walker->pte_access, and then checks for a permission_fault() once the walk
has completed.
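
As a rough illustration of "gather during the walk, check once at the end",
here is a self-contained toy sketch.  It is not KVM's walker: the toy_* names
and bit encodings are hypothetical, it uses a positive "exec" bit instead of
an inverted NX bit, and it assumes the effective permission is simply the AND
of the permissions at every level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy encodings, NOT KVM's definitions. */
enum { ACC_EXEC = 1, ACC_WRITE = 2, ACC_USER = 4, ACC_ALL = 7 };

struct toy_walker {
	unsigned int pte_access;	/* effective permissions after the walk */
};

/* perms[i] holds the permission bits of the entry used at level i. */
static void toy_walk(struct toy_walker *walker, const unsigned int *perms,
		     size_t levels)
{
	unsigned int access = ACC_ALL;

	assert(walker && (perms || levels == 0));

	/* A permission survives only if every level grants it. */
	for (size_t i = 0; i < levels; i++)
		access &= perms[i];

	walker->pte_access = access;
}

/* Single check after the walk: is every required permission present? */
static bool toy_access_faults(const struct toy_walker *walker,
			      unsigned int required)
{
	return (walker->pte_access & required) != required;
}
```

With a four-level walk where one intermediate entry lacks the exec bit, the
final check reports a fault for a fetch without anyone having to look up that
bit by hand after the fact.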