Message-ID: <Z36zWVBOiBF4g-mW@google.com>
Date: Wed, 8 Jan 2025 09:18:17 -0800
From: Sean Christopherson <seanjc@...gle.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Borislav Petkov <bp@...nel.org>, X86 ML <x86@...nel.org>, Paolo Bonzini <pbonzini@...hat.com>, 
	Josh Poimboeuf <jpoimboe@...hat.com>, Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>, 
	KVM <kvm@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 3/4] x86/bugs: KVM: Add support for SRSO_MSR_FIX

On Wed, Jan 08, 2025, Borislav Petkov wrote:
> > And do you know what 0xd23f corresponds to?
> 
> How's that:
> 
> $ objdump -D arch/x86/kvm/kvm.ko
> ...
> 000000000000d1a0 <kvm_vcpu_halt>:
>     d1a0:       e8 00 00 00 00          call   d1a5 <kvm_vcpu_halt+0x5>
>     d1a5:       55                      push   %rbp
>     ...
> 
>     d232:       e8 09 93 ff ff          call   6540 <kvm_vcpu_check_block>
>     d237:       85 c0                   test   %eax,%eax
>     d239:       0f 88 f6 01 00 00       js     d435 <kvm_vcpu_halt+0x295>
>     d23f:       f3 90                   pause
>     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
>     d241:       e8 00 00 00 00          call   d246 <kvm_vcpu_halt+0xa6>
>     d246:       48 89 c3                mov    %rax,%rbx
>     d249:       e8 00 00 00 00          call   d24e <kvm_vcpu_halt+0xae>
>     d24e:       84 c0                   test   %al,%al
> 
> 
> Which makes sense :-)
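
For reference, the highlighted address can be written as a symbol-relative
offset, the form in which it would typically show up in a profile or backtrace.
A minimal Python sketch, using only addresses from the objdump listing above
(the helper name symbol_offset is made up for illustration):

```python
def symbol_offset(symbol_start: int, address: int) -> str:
    """Format an address as a hex offset from the start of its symbol."""
    return f"+{address - symbol_start:#x}"


KVM_VCPU_HALT = 0xd1a0  # start of kvm_vcpu_halt in the listing
PAUSE_INSN = 0xd23f     # address of the highlighted `pause` instruction

# Sanity check against an offset that objdump itself printed:
# "call d246 <kvm_vcpu_halt+0xa6>"
assert symbol_offset(KVM_VCPU_HALT, 0xd246) == "+0xa6"

print("kvm_vcpu_halt" + symbol_offset(KVM_VCPU_HALT, PAUSE_INSN))
```

The same subtraction reproduces the js target printed in the listing
(d435 is kvm_vcpu_halt+0x295), which is a quick way to confirm the base
address is right.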

Ooh, it's just the MSR writes that increased.  I misinterpreted the profile
statement and thought that something in KVM was jumping from ~0% to 4.31%.  If
the cost really is just this:

   1.66%  qemu-system-x86  [kernel.kallsyms]        [k] native_write_msr
   1.50%  qemu-system-x86  [kernel.kallsyms]        [k] native_write_msr_safe

vs

   1.01%  qemu-system-x86  [kernel.kallsyms]        [k] native_write_msr
   0.81%  qemu-system-x86  [kernel.kallsyms]        [k] native_write_msr_safe

then my vote is to go with the user_return approach.  It's unfortunate that
restoring full speculation may be delayed until a CPU exits to userspace or KVM
is unloaded, but given that enable_virt_at_load is enabled by default, in practice
it's likely still far better than effectively always running the host with reduced
speculation.
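
To put a number on the difference, the two pairs of perf samples quoted above
can simply be summed and compared. This is only arithmetic on the percentages
in this mail; the labels "first" and "second" refer to the order in which the
samples appear above, and the variable names are made up for illustration:

```python
from decimal import Decimal

# Percentages copied from the two perf excerpts, in the order quoted.
first = {
    "native_write_msr": Decimal("1.66"),
    "native_write_msr_safe": Decimal("1.50"),
}
second = {
    "native_write_msr": Decimal("1.01"),
    "native_write_msr_safe": Decimal("0.81"),
}


def total(sample: dict) -> Decimal:
    """Sum the MSR-write share of a profile, in percent."""
    return sum(sample.values(), Decimal("0"))


delta = total(first) - total(second)
print(f"first: {total(first)}%  second: {total(second)}%  delta: {delta} points")
```

Decimal is used instead of float so the two-digit percentages add up exactly.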

> > Yeah, especially if this is all an improvement over the existing mitigation.
> > Though since it can impact non-virtualization workloads, maybe it should be a
> > separately selectable mitigation?  I.e. not piggybacked on top of ibpb-vmexit?
> 
> Well, ibpb-on-vmexit is your typical cloud provider scenario where you address
> the VM/VM attack vector by doing an IBPB on VMEXIT. 

No?  svm_vcpu_load() emits IBPB when switching VMCBs, i.e. when switching between
vCPUs that may live in separate security contexts.  That IBPB is skipped when
X86_FEATURE_IBPB_ON_VMEXIT is enabled, because the host is trusted to not attack
its guests.
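
The behavior described above can be modeled roughly as follows. This is a toy
Python model of the decision only, not the kernel's code: the class, its
attributes, and the counter are all hypothetical, and the only facts taken from
the description are "IBPB on VMCB switch" and "skipped when
X86_FEATURE_IBPB_ON_VMEXIT is enabled".

```python
class ToyCpu:
    """Toy model of one physical CPU's IBPB-on-VMCB-switch decision."""

    def __init__(self, ibpb_on_vmexit: bool):
        self.ibpb_on_vmexit = ibpb_on_vmexit  # models X86_FEATURE_IBPB_ON_VMEXIT
        self.current_vmcb = None              # VMCB most recently loaded here
        self.ibpb_count = 0                   # barriers issued at load time

    def vcpu_load(self, vmcb: str) -> None:
        # Only a *change* of VMCB is interesting: a different VMCB may
        # belong to a vCPU in a different security context.
        if self.current_vmcb != vmcb:
            self.current_vmcb = vmcb
            # With IBPB on every VMEXIT, the barrier at load time is skipped.
            if not self.ibpb_on_vmexit:
                self.ibpb_count += 1


def run(ibpb_on_vmexit: bool) -> int:
    cpu = ToyCpu(ibpb_on_vmexit)
    for vmcb in ["A", "A", "B", "A"]:
        cpu.vcpu_load(vmcb)
    return cpu.ibpb_count


print(run(ibpb_on_vmexit=False), run(ibpb_on_vmexit=True))
```

In the model, reloading the same VMCB back-to-back costs nothing, each switch
costs one barrier, and enabling the feature removes the load-time barriers
entirely.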

> This SRSO_MSR_FIX thing protects the *host* from a malicious guest so you
> need both enabled for full protection on the guest/host vector.

If reducing speculation protects the host, why wouldn't that also protect other
guests?  The CPU needs to bounce through the host before entering a different
guest.

And if for some reason reducing speculation doesn't suffice, wouldn't it be
better to fall back to doing IBPB only when switching VMCBs?