Message-ID: <aRTubGCENf2oypeL@google.com>
Date: Wed, 12 Nov 2025 12:30:36 -0800
From: Sean Christopherson <seanjc@...gle.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>, Josh Poimboeuf <jpoimboe@...nel.org>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
Brendan Jackman <jackmanb@...gle.com>
Subject: Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter
 assembly via ALTERNATIVES_2

On Wed, Nov 12, 2025, Borislav Petkov wrote:
> On Wed, Nov 12, 2025 at 09:15:00AM -0800, Sean Christopherson wrote:
> > On Wed, Nov 12, 2025, Borislav Petkov wrote:
> > > So this VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO bit gets set here:
> > >
> > > if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_MMIO) &&
> > > kvm_vcpu_can_access_host_mmio(&vmx->vcpu))
> > > flags |= VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO;
> > >
> > > So how static and/or dynamic is this?
> >
> > kvm_vcpu_can_access_host_mmio() is very dynamic. It can be different between
> > vCPUs in a VM, and can even change on back-to-back runs of the same vCPU.
>
> Hmm, strange. Because looking at those things there:
>
> root->has_mapped_host_mmio and vcpu->kvm->arch.has_mapped_host_mmio
>
> they both read like something that a guest would set up once and that's it.
> But what do I know...

They're set based on what memory is mapped into the KVM-controlled page tables,
e.g. into the EPT/NPT tables, that will be used by the vCPU for that VM-Enter.
root->has_mapped_host_mmio is per page table.  vcpu->kvm->arch.has_mapped_host_mmio
exists because of nastiness related to shadow paging; for all intents and purposes,
I would just mentally ignore that one.

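To illustrate why the result can change between back-to-back runs, here is a
rough C model.  It is only a sketch: the model_* structs and the function are
hypothetical stand-ins, not the real KVM definitions, and the fallback to the
VM-wide flag when there is no root is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the real KVM structures. */
struct model_root {
	bool has_mapped_host_mmio;	/* tracked per page table (per root) */
};

struct model_vcpu {
	const struct model_root *root;	/* root for the next VM-Enter, may be NULL */
	bool vm_has_mapped_host_mmio;	/* VM-wide flag (shadow paging corner case) */
};

/*
 * Must be re-evaluated before every run: the answer depends on which root
 * the vCPU is about to use, and on what is currently mapped into that root.
 */
static inline bool model_vcpu_can_access_host_mmio(const struct model_vcpu *vcpu)
{
	assert(vcpu != NULL);

	if (vcpu->root)
		return vcpu->root->has_mapped_host_mmio;

	/* Assumed fallback when no per-root flag is available. */
	return vcpu->vm_has_mapped_host_mmio;
}
```

In this model, pointing the same vCPU at a different root, or mapping host MMIO
into its current root, flips the result without anything else changing.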
> > > IOW, can you stick this into a simple variable which is unconditionally
> > > updated and you can use it in X86_FEATURE_CLEAR_CPU_BUF_MMIO case and
> > > otherwise it simply remains unused?
> >
> > Can you elaborate? I don't think I follow what you're suggesting.
>
> So I was thinking if you could set a per-guest variable in
> C - vmx_per_guest_clear_per_mmio or so and then test it in asm:
>
> testb $1,vmx_per_guest_clear_per_mmio(%rip)
> jz .Lskip_clear_cpu_buffers;
> CLEAR_CPU_BUFFERS_SEQ;
>
> .Lskip_clear_cpu_buffers:
>
> gcc -O3 suggests also
>
> cmpb $0x0,vmx_per_guest_clear_per_mmio(%rip)
>
> which is the same insn size...
>
> The idea is to get rid of this first asm stashing things and it'll be a bit
> more robust, I'd say.

VMX "needs" to abuse RFLAGS no matter what, because RFLAGS is the only register
that's available at the time of VMLAUNCH/VMRESUME.  On Intel, only RSP and
RFLAGS are context switched via the VMCS, all other GPRs need to be context
switched by software.  Which is why I didn't balk at Pawan's idea to use RFLAGS.ZF
to track whether or not a VERW for MMIO is needed.

Hmm, actually, @flags is already on the stack because it's needed at VM-Exit.
Using EBX was a holdover from the conversion from inline asm to "proper" asm,
e.g. from commit 77df549559db ("KVM: VMX: Pass @launched to the vCPU-run asm via
standard ABI regs").

Oooh, and if we stop using bt+RFLAGS.CF, then we drop the annoying SHIFT definitions
in arch/x86/kvm/vmx/run_flags.h.
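
For anyone wondering why the SHIFT definitions go away: BT takes a bit index,
whereas TEST takes a mask, so only the mask needs to exist once BT is gone.  A
small C model of the two checks, with made-up MODEL_* names and illustrative
values (not copied from run_flags.h):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only, NOT the real run_flags.h definitions. */
#define MODEL_RUN_VMRESUME_SHIFT	0
#define MODEL_RUN_CLEAR_MMIO_SHIFT	2

#define MODEL_RUN_VMRESUME	(1u << MODEL_RUN_VMRESUME_SHIFT)
#define MODEL_RUN_CLEAR_MMIO	(1u << MODEL_RUN_CLEAR_MMIO_SHIFT)

/* Models "bt $shift, reg": CF = the selected bit.  Needs the bit index. */
static inline bool model_bt_cf(unsigned int flags, unsigned int shift)
{
	return (flags >> shift) & 1u;
}

/* Models "test $mask, mem": ZF = 1 iff (mem & mask) == 0.  Needs only the mask. */
static inline bool model_test_zf(unsigned int flags, unsigned int mask)
{
	return (flags & mask) == 0;
}

/* For a single-bit mask, "CF set" after bt is the same as "ZF clear" after test. */
static inline void model_check_equivalence(unsigned int flags)
{
	assert(model_bt_cf(flags, MODEL_RUN_CLEAR_MMIO_SHIFT) ==
	       !model_test_zf(flags, MODEL_RUN_CLEAR_MMIO));
}
```

I.e. a jc after bt and a jnz after test branch on the same condition, and the
test form can be written with the mask alone.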

Very lightly tested at this point, but I think this can all be simplified to

	/*
	 * Note, ALTERNATIVE_2 works in reverse order.  If CLEAR_CPU_BUF_VM is
	 * enabled, do VERW unconditionally.  If CLEAR_CPU_BUF_VM_MMIO is enabled,
	 * check @flags to see if the vCPU has access to host MMIO, and do VERW
	 * if so.  Else, do nothing (no mitigations needed/enabled).
	 */
	ALTERNATIVE_2 "", \
		      __stringify(testl $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, WORD_SIZE(%_ASM_SP); \
				  jz .Lskip_clear_cpu_buffers; \
				  VERW; \
				  .Lskip_clear_cpu_buffers:), \
		      X86_FEATURE_CLEAR_CPU_BUF_VM_MMIO, \
		      __stringify(VERW), X86_FEATURE_CLEAR_CPU_BUF_VM

	/* Check if vmlaunch or vmresume is needed */
	testl	$VMX_RUN_VMRESUME, WORD_SIZE(%_ASM_SP)
	jz .Lvmlaunch
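
The decision encoded by the ALTERNATIVE_2 sequence can be read as the C model
below.  This is a sketch for illustration only: the function name and the
MODEL_* mask value are made up, and in the real code the two feature checks are
resolved once when alternatives are patched at boot, with only the @flags test
remaining at runtime.

```c
#include <stdbool.h>

/* Illustrative mask value, NOT taken from run_flags.h. */
#define MODEL_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO	(1u << 2)

/*
 * Returns true if VERW would execute before VM-Enter.  The last entry of
 * ALTERNATIVE_2 takes priority, hence CLEAR_CPU_BUF_VM is checked first.
 */
static inline bool model_verw_before_vmenter(bool clear_cpu_buf_vm,
					     bool clear_cpu_buf_vm_mmio,
					     unsigned int run_flags)
{
	/* X86_FEATURE_CLEAR_CPU_BUF_VM: unconditional VERW. */
	if (clear_cpu_buf_vm)
		return true;

	/* X86_FEATURE_CLEAR_CPU_BUF_VM_MMIO: VERW only if @flags says so. */
	if (clear_cpu_buf_vm_mmio)
		return (run_flags & MODEL_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO) != 0;

	/* Neither feature set: the empty "" alternative, i.e. no VERW. */
	return false;
}
```
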
> And you don't rely on registers...
>
> and when I say that, I now realize this is 32-bit too and you don't want to
> touch regs - that's why you're stashing it - and there's no rip-relative on
> 32-bit...
>
> I dunno - it might get hairy but I would still opt for a different solution
> instead of this fragile stashing in ZF. You could do a function which pushes
> and pops a scratch register where you put the value, i.e., you could do
>
> push %reg
> mov var, %reg
> test or cmp ...
> ...
> jz skip...
> skip:
> pop %reg
>
> It is still all together in one place instead of spreading it around like
> that.

FWIW, all GPRs except RSP are off limits.  But as above, getting at @flags via
RSP is trivial.