Message-ID: <20251030172836.5ys2wag3dax5fmwk@desk>
Date: Thu, 30 Oct 2025 10:28:36 -0700
From: Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>
To: Brendan Jackman <jackmanb@...gle.com>
Cc: Thomas Gleixner <tglx@...utronix.de>, Borislav Petkov <bp@...en8.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Josh Poimboeuf <jpoimboe@...nel.org>,
	Ingo Molnar <mingo@...hat.com>,
	Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
	"H. Peter Anvin" <hpa@...or.com>,
	Sean Christopherson <seanjc@...gle.com>,
	Paolo Bonzini <pbonzini@...hat.com>, linux-kernel@...r.kernel.org,
	kvm@...r.kernel.org, Tao Zhang <tao1.zhang@...el.com>,
	Jim Mattson <jmattson@...gle.com>
Subject: Re: [PATCH 3/3] x86/mmio: Unify VERW mitigation for guests

On Thu, Oct 30, 2025 at 12:52:12PM +0000, Brendan Jackman wrote:
> On Wed Oct 29, 2025 at 9:26 PM UTC, Pawan Gupta wrote:
> > When a system is only affected by MMIO Stale Data, VERW mitigation is
> > currently handled differently than other data sampling attacks like
> > MDS/TAA/RFDS, that do the VERW in asm. This is because for MMIO Stale Data,
> > VERW is needed only when the guest can access host MMIO, this was tricky to
> > check in asm.
> >
> > Refactoring done by:
> >
> >   83ebe7157483 ("KVM: VMX: Apply MMIO Stale Data mitigation if KVM maps
> >   MMIO into the guest")
> >
> > now makes it easier to execute VERW conditionally in asm based on
> > VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO.
> >
> > Unify MMIO Stale Data mitigation with other VERW-based mitigations and only
> > have single VERW callsite in __vmx_vcpu_run(). Remove the now unnecessary
> > call to x86_clear_cpu_buffer() in vmx_vcpu_enter_exit().
> >
> > This also untangles L1D Flush and MMIO Stale Data mitigation. Earlier, an
> > L1D Flush would skip the VERW for MMIO Stale Data. Now, both the
> > mitigations are independent of each other. Although, this has little
> > practical implication since there are no CPUs that are affected by L1TF and
> > are *only* affected by MMIO Stale Data (i.e. not affected by MDS/TAA/RFDS).
> > But, this makes the code cleaner and easier to maintain.
> >
> > Signed-off-by: Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>
> > ---
> >  arch/x86/kvm/vmx/run_flags.h | 12 ++++++------
> >  arch/x86/kvm/vmx/vmenter.S   |  5 +++++
> >  arch/x86/kvm/vmx/vmx.c       | 26 ++++++++++----------------
> >  3 files changed, 21 insertions(+), 22 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx/run_flags.h b/arch/x86/kvm/vmx/run_flags.h
> > index 2f20fb170def8b10c8c0c46f7ba751f845c19e2c..004fe1ca89f05524bf3986540056de2caf0abbad 100644
> > --- a/arch/x86/kvm/vmx/run_flags.h
> > +++ b/arch/x86/kvm/vmx/run_flags.h
> > @@ -2,12 +2,12 @@
> >  #ifndef __KVM_X86_VMX_RUN_FLAGS_H
> >  #define __KVM_X86_VMX_RUN_FLAGS_H
> >  
> > -#define VMX_RUN_VMRESUME_SHIFT				0
> > -#define VMX_RUN_SAVE_SPEC_CTRL_SHIFT			1
> > -#define VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO_SHIFT	2
> > +#define VMX_RUN_VMRESUME_SHIFT			0
> > +#define VMX_RUN_SAVE_SPEC_CTRL_SHIFT		1
> > +#define VMX_RUN_CLEAR_CPU_BUFFERS_SHIFT		2
> >  
> > -#define VMX_RUN_VMRESUME			BIT(VMX_RUN_VMRESUME_SHIFT)
> > -#define VMX_RUN_SAVE_SPEC_CTRL			BIT(VMX_RUN_SAVE_SPEC_CTRL_SHIFT)
> > -#define VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO	BIT(VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO_SHIFT)
> > +#define VMX_RUN_VMRESUME		BIT(VMX_RUN_VMRESUME_SHIFT)
> > +#define VMX_RUN_SAVE_SPEC_CTRL		BIT(VMX_RUN_SAVE_SPEC_CTRL_SHIFT)
> > +#define VMX_RUN_CLEAR_CPU_BUFFERS	BIT(VMX_RUN_CLEAR_CPU_BUFFERS_SHIFT)
> >  
> >  #endif /* __KVM_X86_VMX_RUN_FLAGS_H */
> > diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
> > index 0dd23beae207795484150698d1674dc4044cc520..ec91f4267eca319ffa8e6079887e8dfecc7f96d8 100644
> > --- a/arch/x86/kvm/vmx/vmenter.S
> > +++ b/arch/x86/kvm/vmx/vmenter.S
> > @@ -137,6 +137,9 @@ SYM_FUNC_START(__vmx_vcpu_run)
> >  	/* Load @regs to RAX. */
> >  	mov (%_ASM_SP), %_ASM_AX
> >  
> > +	/* jz .Lskip_clear_cpu_buffers below relies on this */
> > +	test $VMX_RUN_CLEAR_CPU_BUFFERS, %ebx
> > +
> >  	/* Check if vmlaunch or vmresume is needed */
> >  	bt   $VMX_RUN_VMRESUME_SHIFT, %ebx
> >  
> > @@ -160,6 +163,8 @@ SYM_FUNC_START(__vmx_vcpu_run)
> >  	/* Load guest RAX.  This kills the @regs pointer! */
> >  	mov VCPU_RAX(%_ASM_AX), %_ASM_AX
> >  
> > +	/* Check EFLAGS.ZF from the VMX_RUN_CLEAR_CPU_BUFFERS bit test above */
> > +	jz .Lskip_clear_cpu_buffers
> 
> Hm, it's a bit weird that we have the "alternative" inside
> VM_CLEAR_CPU_BUFFERS, but then we still keep the test+jz
> unconditionally. 

Exactly, but it is tricky to handle the two cases below in asm:

1. MDS  -> Always do VM_CLEAR_CPU_BUFFERS
2. MMIO -> Do VM_CLEAR_CPU_BUFFERS only if the guest can access host MMIO

In the MMIO case, one guest may have access to host MMIO while another may
not. Alternatives alone can't handle this, as they patch code once at boot
and are then set in stone. One way is to move the conditional inside
VM_CLEAR_CPU_BUFFERS and have it take a flag as an argument.
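
Roughly along these lines (purely illustrative and untested, the macro name
is made up; it also glosses over the fact that at the actual VERW site in
__vmx_vcpu_run() the flags register has already been clobbered by guest
state, which is why the patch splits the test and the jz):

  /* Illustrative sketch only, the macro name is made up */
  .macro VM_CLEAR_CPU_BUFFERS_COND run_flags:req
	/* Skip the VERW when the run flag is not set for this guest */
	test $VMX_RUN_CLEAR_CPU_BUFFERS, \run_flags
	jz .Lskip_verw_\@
	/* Existing alternative-based macro emits the actual VERW */
	VM_CLEAR_CPU_BUFFERS
  .Lskip_verw_\@:
  .endm
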
> If we really want to super-optimise the no-mitigations-needed case,
> shouldn't we want to avoid the conditional in the asm if it never
> actually leads to a flush?

Ya, so effectively, have the VM_CLEAR_CPU_BUFFERS alternative spit out a
conditional VERW when affected by MMIO Stale Data only, otherwise an
unconditional VERW.
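
i.e. something in this direction (completely untested sketch; the MMIO-only
feature flag is made up, and the VERW selector symbol is from memory, so the
name may not match the current code):

  .macro VM_CLEAR_CPU_BUFFERS_ALT
	/* ZF must already reflect the VMX_RUN_CLEAR_CPU_BUFFERS test */
	ALTERNATIVE_2 "",							\
		"verw _ASM_RIP(x86_verw_sel)", X86_FEATURE_CLEAR_CPU_BUF_VM,	\
		"jz 901f; verw _ASM_RIP(x86_verw_sel); 901:", X86_FEATURE_CLEAR_CPU_BUF_VM_MMIO
  .endm
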
> On the other hand, if we don't mind a couple of extra instructions,
> shouldn't we be fine with just having the whole asm code based solely
> on VMX_RUN_CLEAR_CPU_BUFFERS and leaving the
> X86_FEATURE_CLEAR_CPU_BUF_VM to the C code?

That's also an option.
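
On the C side that could look something like this (purely a sketch; the
helper name is made up, standing in for whatever check 83ebe7157483 added):

	/* Illustrative only, vmx_guest_can_access_host_mmio() is a made-up name */
	static bool vmx_need_verw_before_vmentry(struct kvm_vcpu *vcpu)
	{
		if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_VM))
			return true;

		return boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA) &&
		       vmx_guest_can_access_host_mmio(vcpu);
	}

and then when building the run flags:

	if (vmx_need_verw_before_vmentry(vcpu))
		flags |= VMX_RUN_CLEAR_CPU_BUFFERS;

so the asm only ever looks at VMX_RUN_CLEAR_CPU_BUFFERS.
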
> I guess the issue is that in the latter case we'd be back to having
> unnecessary inconsistency with AMD code while in the former case... well
> that would just be really annoying asm code - am I on the right
> wavelength there? So I'm not necessarily asking for changes here, just
> probing in case it prompts any interesting insights on your side.
> 
> (Also, maybe this test+jz has a similar cost to the nops that the
> "alternative" would inject anyway...?)

Likely yes. The test+jz is a necessary evil, needed because MMIO Stale Data
requires different per-guest handling.