linux-kernel - Re: [PATCH] x86/bugs: Adjust SRSO mitigation to new features

Message-ID: <20241105123416.GBZyoQyAoUmZi9eMkk@fat_crate.local>
Date: Tue, 5 Nov 2024 13:34:16 +0100
From: Borislav Petkov <bp@...en8.de>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Borislav Petkov <bp@...nel.org>, X86 ML <x86@...nel.org>,
	Josh Poimboeuf <jpoimboe@...hat.com>,
	Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
	kvm@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86/bugs: Adjust SRSO mitigation to new features

On Mon, Nov 04, 2024 at 04:57:20PM -0800, Sean Christopherson wrote:
> scripts/get_maintainer.pl :-)

That's what I used but I pruned the list.

Why, did I miss anyone?
 
> It's not strictly KVM module load, it's when KVM enables virtualization.

Yeah, the KVM CPU hotplug callback.

> E.g. if userspace clears enable_virt_at_load,

/me reads the documentation on that...

Interesting :-)

Put all the work possible in the module load so that VM startup is minimal.

> the MSR will be toggled every time the number of VMs goes from 0=>1 and
> 1=>0.

I guess that's fine. 

> But why do this in KVM?  E.g. why not set-and-forget in init_amd_zen4()?

Because there's no need to impose an unnecessary - albeit small - perf impact
on users who don't do virt.

I'm currently gravitating towards the MSR toggling thing, i.e., only when the
number of VMs goes 0=>1 but I'm not sure. If udev rules *always* load kvm.ko
then yes, the toggling thing sounds better. I.e., set it only when really needed.
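
To make the 0=>1 / 1=>0 toggling concrete, here is a minimal userspace sketch.
It is only a model: the MSR is a plain variable, the helper names
(fake_msr_set_bit(), vm_create(), vm_destroy()) are hypothetical, and the bit
position is a placeholder, not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit position, assumed for illustration only. */
#define BP_SPEC_REDUCE_BIT 4

static uint64_t fake_bp_cfg;     /* simulated MSR_ZEN4_BP_CFG on one CPU */
static unsigned int nr_vms;      /* number of live VMs */
static unsigned int msr_writes;  /* counts writes to the simulated MSR */

static void fake_msr_set_bit(int bit)
{
	fake_bp_cfg |= 1ULL << bit;
	msr_writes++;
}

static void fake_msr_clear_bit(int bit)
{
	fake_bp_cfg &= ~(1ULL << bit);
	msr_writes++;
}

/* Set the bit only on the 0=>1 transition of the VM count. */
static void vm_create(void)
{
	if (nr_vms++ == 0)
		fake_msr_set_bit(BP_SPEC_REDUCE_BIT);
}

/* Clear the bit only on the 1=>0 transition of the VM count. */
static void vm_destroy(void)
{
	assert(nr_vms > 0);
	if (--nr_vms == 0)
		fake_msr_clear_bit(BP_SPEC_REDUCE_BIT);
}

static bool spec_reduce_enabled(void)
{
	return (fake_bp_cfg & (1ULL << BP_SPEC_REDUCE_BIT)) != 0;
}
```

In this model, creating a second VM or destroying one of two VMs does not
touch the MSR at all, so a host that never starts a VM never pays for the
bit being set.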

> Shouldn't these be two separate patches?  AFAICT, while the two are related,
> there are no strict dependencies between SRSO_USER_KERNEL_NO and
> SRSO_MSR_FIX.

Meh, I can split them if you really want me to.

> If the expectation is that X86_FEATURE_SRSO_USER_KERNEL_NO will only ever come
> from hardware, i.e. won't be force-set by the kernel, then I would prefer to set
> the bit in the "standard" way
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 41786b834b16..eb65336c2168 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -794,7 +794,7 @@ void kvm_set_cpu_caps(void)
>         kvm_cpu_cap_mask(CPUID_8000_0021_EAX,
>                 F(NO_NESTED_DATA_BP) | F(LFENCE_RDTSC) | 0 /* SmmPgCfgLock */ |
>                 F(NULL_SEL_CLR_BASE) | F(AUTOIBRS) | 0 /* PrefetchCtlMsr */ |
> -               F(WRMSR_XX_BASE_NS)
> +               F(WRMSR_XX_BASE_NS) | F(SRSO_USER_KERNEL_NO)

Ok, sure, ofc.

>         );
>  
>         kvm_cpu_cap_check_and_set(X86_FEATURE_SBPB);
> 
> The kvm_cpu_cap_check_and_set() trickery is necessary only for features that are
> force-set by the kernel, in order to avoid kvm_cpu_cap_mask()'s masking of the
> features by actual CPUID.  I'm trying to clean things up to make that more obvious;
> hopefully that'll land in 6.14[*].

Oh please. It took me a while to figure out what each *cap* function is for so
yeah, cleanup would be nice there.
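
For anyone else trying to keep the *cap* helpers apart, a toy model of the
distinction described above may help. This is a simplified sketch, not the
KVM code: the struct, the toy_* functions and the bit positions are all
hypothetical, and it assumes one 32-bit CPUID word.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Hypothetical bit positions within one CPUID word; not the real layout. */
#define FEAT_HW_ONLY  BIT(0)  /* enumerated by hardware CPUID */
#define FEAT_FORCED   BIT(1)  /* force-set by the kernel, not in raw CPUID */
#define FEAT_UNWANTED BIT(2)  /* present in hardware, not offered by KVM */

struct toy_caps {
	uint32_t raw_cpuid;    /* what the CPUID instruction reports */
	uint32_t kernel_caps;  /* kernel's view, including force-set bits */
	uint32_t kvm_caps;     /* what would be advertised to userspace */
};

/* Start from the kernel's view of the CPU. */
static void toy_caps_init(struct toy_caps *c)
{
	c->kvm_caps = c->kernel_caps;
}

/*
 * Models the masking step: keep only the bits KVM asks for, then drop
 * anything the raw CPUID does not report. Force-set bits are lost here.
 */
static void toy_cap_mask(struct toy_caps *c, uint32_t mask)
{
	c->kvm_caps &= mask;
	c->kvm_caps &= c->raw_cpuid;
}

/* Models check-and-set: re-add a bit if the kernel itself has it set. */
static void toy_cap_check_and_set(struct toy_caps *c, uint32_t feat)
{
	if (c->kernel_caps & feat)
		c->kvm_caps |= feat;
}
```

With raw_cpuid lacking FEAT_FORCED, toy_cap_mask() drops that bit even if it
is in the requested mask, and only toy_cap_check_and_set() brings it back. A
feature that only ever comes from hardware needs just the mask step.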

> And advertising X86_FEATURE_SRSO_USER_KERNEL_NO should also be a separate patch,
> no?  I.e. 
> 
>  1. Use SRSO_USER_KERNEL_NO in the host
>  2. Update KVM to advertise SRSO_USER_KERNEL_NO to userspace, i.e. let userspace
>     know that it can be enumerated to the guest.
>  3. Add support for SRSO_MSR_FIX.

Sure, I can split. I'm lazy and all but ok... :-P

> [*] https://lore.kernel.org/all/20240517173926.965351-49-seanjc@google.com

Cool.

> 
> >  	kvm_cpu_cap_check_and_set(X86_FEATURE_SRSO_NO);
> >  
> >  	kvm_cpu_cap_init_kvm_defined(CPUID_8000_0022_EAX,
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 9df3e1e5ae81..03f29912a638 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -608,6 +608,9 @@ static void svm_disable_virtualization_cpu(void)
> >  	kvm_cpu_svm_disable();
> >  
> >  	amd_pmu_disable_virt();
> > +
> > +	if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
> > +		msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> 
> I don't like assuming the state of hardware.  E.g. if MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT
> was already set, then KVM shouldn't clear it.

Right, I don't see that happening tho. If we have to sync the toggling of this
bit between different places, we'll have to do some dance but so far its only
user is KVM.
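
If such a dance ever became necessary, one possible shape is to remember per
CPU whether KVM was the one that set the bit and only clear it in that case.
The following is a userspace sketch under that assumption, not what the
patch does: the per-CPU arrays, the function names and the bit mask are all
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4
/* Placeholder mask, assumed for illustration only. */
#define SPEC_REDUCE_MASK (1ULL << 4)

static uint64_t fake_msr[NR_CPUS];  /* simulated per-CPU MSR value */
static bool kvm_set_it[NR_CPUS];    /* did "KVM" set the bit on this CPU? */

/* Set the bit when enabling virtualization, unless someone already did. */
static void virt_enable_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	if (fake_msr[cpu] & SPEC_REDUCE_MASK) {
		kvm_set_it[cpu] = false;  /* not ours, leave it alone later */
		return;
	}

	fake_msr[cpu] |= SPEC_REDUCE_MASK;
	kvm_set_it[cpu] = true;
}

/* Clear the bit when disabling virtualization, but only if we set it. */
static void virt_disable_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	if (!kvm_set_it[cpu])
		return;

	fake_msr[cpu] &= ~SPEC_REDUCE_MASK;
	kvm_set_it[cpu] = false;
}
```

Tracking ownership per CPU avoids assuming that all CPUs hold the same value,
at the cost of one extra flag per CPU.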

> KVM's usual method of restoring host MSRs is to snapshot the MSR into
> "struct kvm_host_values" on module load, and then restore from there as
> needed.  But that assumes all CPUs have the same value, which might not be
> the case here?

Yes, the default value is 0 out of reset and it should be set on each logical
CPU whenever we run VMs on it. I'd love to make it part of the VMRUN microcode
but... :-)

> All that said, I'd still prefer that MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT is set
> during boot, unless there's a good reason not to do so.

Yeah, unnecessary penalty on machines not running virt.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette