Message-ID: <aQUtwsfxEsUi4us0@google.com>
Date: Fri, 31 Oct 2025 14:44:34 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Brendan Jackman <jackmanb@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Thomas Gleixner <tglx@...utronix.de>,
Borislav Petkov <bp@...en8.de>, Peter Zijlstra <peterz@...radead.org>,
Josh Poimboeuf <jpoimboe@...nel.org>, kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>
Subject: Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter
assembly via ALTERNATIVES_2
On Fri, Oct 31, 2025, Brendan Jackman wrote:
> On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
> > Rework the handling of the MMIO Stale Data mitigation to clear CPU buffers
> > immediately prior to VM-Enter, i.e. in the same location that KVM emits a
> > VERW for unconditional (at runtime) clearing. Co-locating the code and
> > using a single ALTERNATIVES_2 makes it more obvious how VMX mitigates the
> > various vulnerabilities.
> >
> > Deliberately order the alternatives as:
> >
> > 0. Do nothing
> > 1. Clear if vCPU can access MMIO
> > 2. Clear always
> >
> > since the last alternative wins in ALTERNATIVES_2(), i.e. so that KVM will
> > honor the strictest mitigation (always clear CPU buffers) if multiple
> > mitigations are selected. E.g. even if the kernel chooses to mitigate
> > MMIO Stale Data via X86_FEATURE_CLEAR_CPU_BUF_MMIO, some other mitigation
> > may enable X86_FEATURE_CLEAR_CPU_BUF_VM, and that other thing needs to win.
> >
> > Note, decoupling the MMIO mitigation from the L1TF mitigation also fixes
> > a mostly-benign flaw where KVM wouldn't do any clearing/flushing if the
> > L1TF mitigation is configured to conditionally flush the L1D, and the MMIO
> > mitigation but not any other "clear CPU buffers" mitigation is enabled.
> > For that specific scenario, KVM would skip clearing CPU buffers for the
> > MMIO mitigation even though the kernel requested a clear on every VM-Enter.
> >
> > Note #2, the flaw goes back to the introduction of the MDS mitigation. The
> > MDS mitigation was inadvertently fixed by commit 43fb862de8f6 ("KVM/VMX:
> > Move VERW closer to VMentry for MDS mitigation"), but previous kernels
> > that flush CPU buffers in vmx_vcpu_enter_exit() are affected (though it's
> > unlikely the flaw is meaningfully exploitable even on older kernels).
> >
> > Fixes: 650b68a0622f ("x86/kvm/vmx: Add MDS protection when L1D Flush is not active")
> > Suggested-by: Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>
> > Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> > ---
> > arch/x86/kvm/vmx/vmenter.S | 14 +++++++++++++-
> > arch/x86/kvm/vmx/vmx.c | 13 -------------
> > 2 files changed, 13 insertions(+), 14 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
> > index 1f99a98a16a2..61a809790a58 100644
> > --- a/arch/x86/kvm/vmx/vmenter.S
> > +++ b/arch/x86/kvm/vmx/vmenter.S
> > @@ -71,6 +71,7 @@
> > * @regs: unsigned long * (to guest registers)
> > * @flags: VMX_RUN_VMRESUME: use VMRESUME instead of VMLAUNCH
> > * VMX_RUN_SAVE_SPEC_CTRL: save guest SPEC_CTRL into vmx->spec_ctrl
> > + * VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO: vCPU can access host MMIO
> > *
> > * Returns:
> > * 0 on VM-Exit, 1 on VM-Fail
> > @@ -137,6 +138,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> > /* Load @regs to RAX. */
> > mov (%_ASM_SP), %_ASM_AX
> >
> > + /* Stash "clear for MMIO" in EFLAGS.ZF (used below). */
> > + ALTERNATIVE_2 "", \
> > + __stringify(test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx), \
> > + X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> > + "", X86_FEATURE_CLEAR_CPU_BUF_VM
>
> Ah, so this ALTERNATIVE_2 (instead of just an ALTERNATIVE that checks
> CLEAR_CPU_BUF_MMIO) is really about avoiding the flags needing to be
> mutually exclusive?
Yeah, more or less. More specifically, I want to keep the X vs. Y logic in one
place (well, two if you count both ALTERNATIVE_2 flows), so that in general,
from KVM's perspective, the mitigations are handled as independent things. E.g.
even if CLEAR_CPU_BUF_VM and CLEAR_CPU_BUF_MMIO are mutually exclusive within
the kernel (and it's not clear to me that that's 100% guaranteed), I want to
limit how much of KVM assumes they are exclusive. Partly to avoid "oops, we
forgot to mitigate that thing you care about", partly so that reading code like
the setting of VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO doesn't require understanding
the relationship between CLEAR_CPU_BUF_VM and CLEAR_CPU_BUF_MMIO.
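To make the intent concrete, here's roughly what I want the two sites to patch
down to for each configuration (an illustrative sketch, not the literal
generated code; "stash" is the TEST site, "consume" is the VERW site below it):

	/* Neither feature: no stash, no clearing. */
	stash:    <nothing>
	consume:  <nothing>

	/* CLEAR_CPU_BUF_MMIO only: clear iff the vCPU can access host MMIO. */
	stash:    test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx
	consume:  jz .Lskip_clear_cpu_buffers
	          CLEAR_CPU_BUFFERS_SEQ
	          .Lskip_clear_cpu_buffers:

	/* CLEAR_CPU_BUF_VM, with or without _MMIO (last alternative wins): */
	stash:    <nothing>
	consume:  CLEAR_CPU_BUFFERS_SEQ    /* unconditional VERW */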
> if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_MMIO) &&
> !cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_VM))
> test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx
>
> ... right? This is a good idea but I think it warrants a comment to
> capture the intent, without having the commit message in short-term
> memory I'd have struggled with this code, I think.
>
> > /* Check if vmlaunch or vmresume is needed */
> > bt $VMX_RUN_VMRESUME_SHIFT, %ebx
> >
> > @@ -161,7 +168,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> > mov VCPU_RAX(%_ASM_AX), %_ASM_AX
> >
> > /* Clobbers EFLAGS.ZF */
> > - VM_CLEAR_CPU_BUFFERS
> > + ALTERNATIVE_2 "", \
> > + __stringify(jz .Lskip_clear_cpu_buffers; \
>
> Maybe I'm just an asm noob
Nah, all of this is definitely playing on hard mode. I'm just thankful we don't
have to deal with the horrors of KVM doing all of this in inline asm. :-D
> (I was very impressed by this trick of using CF and ZF together like this!)
> but I think it's helpful to have the comment like the jnc has below, and
> Pawan had in his version, to really make the test->jz dependency obvious,
> since the two instructions are quite far apart.
>
>
> > + CLEAR_CPU_BUFFERS_SEQ; \
> > + .Lskip_clear_cpu_buffers:), \
> > + X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> > + __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
>
> Sorry I'm really nitpicking but I think it's justified for asm
> readability...
>
> It's a bit unfortunate that one branch says
> CLEAR_CPU_BUFFERS_SEQ and the other says __CLEAR_CPU_BUFFERS. With the
> current code I think it would be more readable to just have
> __stringify(CLEAR_CPU_BUFFERS_SEQ) in the CLEAR_CPU_BUF_VM case, then
> you don't have to mentally expand the macro to see how the two branches
> actually differ.
No preference here (assuming I understand what you're asking).
Is this better?
/*
* Note, this sequence consumes *and* clobbers EFLAGS.ZF. The MMIO
* mitigation uses ZF to track whether or not the vCPU has access to
* host MMIO (see above), and VERW (the instruction microcode hijacks
* to clear CPU buffers) writes ZF.
*/
ALTERNATIVE_2 "", \
__stringify(jz .Lskip_clear_cpu_buffers; \
CLEAR_CPU_BUFFERS_SEQ; \
.Lskip_clear_cpu_buffers:), \
X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
__stringify(CLEAR_CPU_BUFFERS_SEQ), X86_FEATURE_CLEAR_CPU_BUF_VM
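And to your point about the TEST->JZ distance, with only
X86_FEATURE_CLEAR_CPU_BUF_MMIO patched in, the pairing across the two sites
boils down to roughly this (sketch, eliding most of the GPR loads):

	/* Stash "clear for MMIO" in EFLAGS.ZF. */
	test	$VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx

	/* BT writes CF but not ZF, and the guest GPR loads are all MOVs. */
	bt	$VMX_RUN_VMRESUME_SHIFT, %ebx
	mov	VCPU_RCX(%_ASM_AX), %_ASM_CX
	...
	mov	VCPU_RAX(%_ASM_AX), %_ASM_AX

	/* Consume ZF from the TEST above (VERW then clobbers it). */
	jz	.Lskip_clear_cpu_buffers
	CLEAR_CPU_BUFFERS_SEQ
.Lskip_clear_cpu_buffers: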