linux-kernel - Re: [PATCH v10 24/27] KVM: x86: Enable CET virtualization for VMX and advertise to userspace

Message-ID: <0ec64962-393c-4b2d-9689-c0375d7346aa@intel.com>
Date: Tue, 7 May 2024 10:37:47 +0800
From: "Yang, Weijiang" <weijiang.yang@...el.com>
To: Sean Christopherson <seanjc@...gle.com>
CC: <pbonzini@...hat.com>, <dave.hansen@...el.com>, <x86@...nel.org>,
	<kvm@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<peterz@...radead.org>, <chao.gao@...el.com>, <rick.p.edgecombe@...el.com>,
	<mlevitsk@...hat.com>, <john.allen@....com>
Subject: Re: [PATCH v10 24/27] KVM: x86: Enable CET virtualization for VMX and
 advertise to userspace

On 5/7/2024 12:54 AM, Sean Christopherson wrote:
> On Mon, May 06, 2024, Weijiang Yang wrote:
>> On 5/2/2024 7:15 AM, Sean Christopherson wrote:
>>> On Sun, Feb 18, 2024, Yang Weijiang wrote:
>>>> @@ -696,6 +697,20 @@ void kvm_set_cpu_caps(void)
>>>>    		kvm_cpu_cap_set(X86_FEATURE_INTEL_STIBP);
>>>>    	if (boot_cpu_has(X86_FEATURE_AMD_SSBD))
>>>>    		kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD);
>>>> +	/*
>>>> +	 * Don't use boot_cpu_has() to check availability of IBT because the
>>>> +	 * feature bit is cleared in boot_cpu_data when ibt=off is applied
>>>> +	 * in host cmdline.
>>> I'm not convinced this is a good reason to diverge from the host kernel.  E.g.
>>> PCID and many other features honor the host setup, I don't see what makes IBT
>>> special.
>> This is mostly based on our own experience and a hypothesis about cloud
>> computing: when we evolve host kernels, we constantly hit issues with kernel
>> IBT enabled, so we have to disable it by adding ibt=off. But we still need to
>> test the CET features in a VM. If we simply refer to the host's boot CPUID
>> data, IBT cannot be enabled in the VM, which leaves the guest's CET features
>> incomplete.
>>
>> I guess cloud computing could run into a similar dilemma. In that case, the
>> tenant cannot benefit from the feature just because of a host SW problem.
> Hmm, but such issues should be found before deploying a kernel to production.
>
> The one scenario that comes to mind where I can see someone wanting to disable
> IBT would be running an out-of-tree and/or third-party module.

Yes, developers may overlook IBT violations in modules/kernel components and
deploy them anyway. In that case, the host side has to either fix the issues or
disable IBT.

>
>> I know KVM currently always honors host feature configurations, except for
>> LA57, but in the CET case there could be divergence wrt honoring host
>> configuration as long as there's no quirk for the feature.
>>
>> But I think the issue is still open for discussion...
> I'm not totally opposed to the idea.
>
> Somewhat off-topic, the existing LA57 code upon which the IBT check is based is
> flawed, as it doesn't account for the max supported CPUID leaf.  On Intel CPUs,
> that could result in a false positive due to CPUID (stupidly) returning the
> value of the last implemented CPUID leaf, not zeros.  In practice, it doesn't
> cause problems because CPUID.0x7 has been supported since forever, but it's
> still a bug.
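
To make the max-leaf problem concrete, here is a rough userspace model of it.
This is not KVM code: struct cpu_model, model_cpuid() and both helpers are
made-up names, and the model only mimics the documented Intel behavior of
returning the highest basic leaf's data for an out-of-range basic leaf. IBT is
CPUID.(EAX=7,ECX=0):EDX[20]; sub-leaves are ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a CPU's basic CPUID leaves; not kernel code. */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

#define MODEL_NR_LEAVES 16

struct cpu_model {
	uint32_t max_leaf;			/* what CPUID.0:EAX would report */
	struct cpuid_regs leaf[MODEL_NR_LEAVES];
};

/*
 * Model the Intel behavior: a query above the max basic leaf returns the
 * data of the highest implemented basic leaf instead of zeros.
 */
static struct cpuid_regs model_cpuid(const struct cpu_model *cpu, uint32_t leaf)
{
	assert(cpu->max_leaf < MODEL_NR_LEAVES);

	if (leaf > cpu->max_leaf)
		leaf = cpu->max_leaf;
	return cpu->leaf[leaf];
}

/* Open-coded check with no max-leaf guard: can report a false positive. */
static bool raw_has_edx_bit_unchecked(const struct cpu_model *cpu,
				      uint32_t leaf, unsigned int bit)
{
	return (model_cpuid(cpu, leaf).edx & (1u << bit)) != 0;
}

/* Guarded check: an unimplemented leaf never contributes feature bits. */
static bool raw_has_edx_bit(const struct cpu_model *cpu,
			    uint32_t leaf, unsigned int bit)
{
	if (leaf > cpu->max_leaf)
		return false;
	return (model_cpuid(cpu, leaf).edx & (1u << bit)) != 0;
}
```

With a model whose max leaf is 6 and whose leaf 6 happens to have EDX[20] set,
the unchecked helper claims "IBT" for leaf 7 while the guarded one does not.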
>
> Hmm, actually, __kvm_cpu_cap_mask() has the exact same bug.  And that's much less
> theoretical, e.g. kvm_cpu_cap_init_kvm_defined() in particular is likely to cause
> problems at some point.
>
> And I really don't like that KVM open codes calls to cpuid_<reg>() for these
> "raw" features.  One option would be to add helpers to change this:
>
> 	if (cpuid_edx(7) & F(IBT))
> 		kvm_cpu_cap_set(X86_FEATURE_IBT);
>
> to this:
>
> 	if (raw_cpuid_has(X86_FEATURE_IBT))
> 		kvm_cpu_cap_set(X86_FEATURE_IBT);
>
> but I think we can do better, and harden the CPUID code in the process.  If we
> do kvm_cpu_cap_set() _before_ kvm_cpu_cap_mask(), then incorporating the raw host
> CPUID will happen automagically, as __kvm_cpu_cap_mask() will clear bits that
> aren't in host CPUID.
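
A small userspace model of the "set before mask" ordering, under my reading of
the description above. Everything here is a stand-in, not KVM code:
model_boot_caps plays the role of boot_cpu_data (the host kernel's view, where
ibt=off clears IBT), model_raw_cpuid plays the role of raw host CPUID, and
model_cap_mask() assumes the mask step ANDs in both the KVM-supported mask and
raw CPUID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-leaf model; the real code tracks many leaves. */
enum { MODEL_LEAF_7_EDX, MODEL_NR_LEAVES };
#define MODEL_BIT_IBT 20

static uint32_t model_boot_caps[MODEL_NR_LEAVES];	/* host kernel's view */
static uint32_t model_raw_cpuid[MODEL_NR_LEAVES];	/* raw hardware CPUID */
static uint32_t model_kvm_caps[MODEL_NR_LEAVES];	/* what KVM advertises */

/* Start from the host kernel's view, as a stand-in for copying boot_cpu_data. */
static void model_caps_init_from_host(void)
{
	for (int i = 0; i < MODEL_NR_LEAVES; i++)
		model_kvm_caps[i] = model_boot_caps[i];
}

/* Force-set a capability regardless of the host kernel's view. */
static void model_cap_set(unsigned int leaf, unsigned int bit)
{
	assert(leaf < MODEL_NR_LEAVES);
	model_kvm_caps[leaf] |= 1u << bit;
}

/* Keep only bits that KVM supports *and* that raw host CPUID reports. */
static void model_cap_mask(unsigned int leaf, uint32_t kvm_supported)
{
	assert(leaf < MODEL_NR_LEAVES);
	model_kvm_caps[leaf] &= kvm_supported;
	model_kvm_caps[leaf] &= model_raw_cpuid[leaf];
}

static bool model_cap_has(unsigned int leaf, unsigned int bit)
{
	assert(leaf < MODEL_NR_LEAVES);
	return (model_kvm_caps[leaf] & (1u << bit)) != 0;
}
```

In this model, a bit forced on before the mask survives only if raw CPUID also
has it, so the ibt=off case (cleared in the boot view, present in hardware)
ends up advertised, and a CPU without IBT does not.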
>
> The most obvious approach would be to simply call kvm_cpu_cap_set() before
> kvm_cpu_cap_mask(), but that's more than a bit confusing, and would open the door
> for potential bugs due to calling kvm_cpu_cap_set() after kvm_cpu_cap_mask().
> And detecting such bugs would be difficult, because there are features that KVM
> fully emulates, i.e. _must_ be stuffed after kvm_cpu_cap_mask().
>
> Instead of calling kvm_cpu_cap_set() directly, we can take advantage of the fact
> that the F() masks are fed into kvm_cpu_cap_mask(), i.e. are naturally processed
> before the corresponding kvm_cpu_cap_mask().
>
> If we add an array to track which capabilities have been initialized, then F()
> can WARN on improper usage.  That would allow detecting bad "raw" usage, *and*
> would detect (some) scenarios where an F() is fed into the wrong leaf, e.g. if
> we added F(LA57) to CPUID_7_EDX instead of CPUID_7_ECX.
>
> #define F(name)								\
> ({									\
> 	u32 __leaf = __feature_leaf(X86_FEATURE_##name);		\
> 									\
> 	BUILD_BUG_ON(__leaf >= ARRAY_SIZE(kvm_cpu_cap_initialized));	\
> 	WARN_ON_ONCE(kvm_cpu_cap_initialized[__leaf]);			\
> 									\
> 	feature_bit(name);						\
> })
>
> /*
>   * Raw Feature - For features that KVM supports based purely on raw host CPUID,
>   * i.e. that KVM virtualizes even if the host kernel doesn't use the feature.
>   * Simply force set the feature in KVM's capabilities, raw CPUID support will
>   * be factored in by kvm_cpu_cap_mask().
>   */
> #define RAW_F(name)						\
> ({								\
> 	kvm_cpu_cap_set(X86_FEATURE_##name);			\
> 	F(name);						\
> })
>
> Assuming testing doesn't poke a hole in my idea, I'll post a small series.

Fancy enough! But I like the idea :-)
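
For my own understanding, I sketched the tracking idea as a userspace model.
It is only an approximation of the proposal: the model_* names are invented,
plain functions replace the F()/RAW_F() macros, a counter stands in for
WARN_ON_ONCE() (and counts every hit, not just the first), and a runtime
assert() replaces BUILD_BUG_ON(). Features are encoded as leaf * 32 + bit,
with LA57 at CPUID.7:ECX[16] and IBT at CPUID.7:EDX[20].

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum model_leaf { MODEL_CPUID_7_ECX, MODEL_CPUID_7_EDX, MODEL_NR_LEAVES };

/* Encode a feature as leaf * 32 + bit (hypothetical numbering). */
#define MODEL_FEATURE(leaf, bit)	((leaf) * 32 + (bit))
#define MODEL_FEATURE_LA57		MODEL_FEATURE(MODEL_CPUID_7_ECX, 16)
#define MODEL_FEATURE_IBT		MODEL_FEATURE(MODEL_CPUID_7_EDX, 20)

static uint32_t model_raw_cpuid[MODEL_NR_LEAVES];	/* raw host CPUID */
static uint32_t model_caps[MODEL_NR_LEAVES];		/* KVM's capabilities */
static bool model_initialized[MODEL_NR_LEAVES];	/* leaf already masked? */
static unsigned int model_warnings;			/* WARN stand-in */

/* Stand-in for F(): warn if the feature's leaf has already been masked. */
static uint32_t model_f(unsigned int feature)
{
	unsigned int leaf = feature / 32;

	assert(leaf < MODEL_NR_LEAVES);
	if (model_initialized[leaf])
		model_warnings++;
	return 1u << (feature % 32);
}

/* Stand-in for RAW_F(): force-set the bit, then behave like F(). */
static uint32_t model_raw_f(unsigned int feature)
{
	uint32_t bit = model_f(feature);

	model_caps[feature / 32] |= bit;
	return bit;
}

/* Stand-in for the mask step: AND in the mask and raw CPUID, mark done. */
static void model_cap_mask(enum model_leaf leaf, uint32_t mask)
{
	model_caps[leaf] &= mask;
	model_caps[leaf] &= model_raw_cpuid[leaf];
	model_initialized[leaf] = true;
}
```

Because C evaluates a call's arguments before the call itself,
model_cap_mask(MODEL_CPUID_7_EDX, model_raw_f(MODEL_FEATURE_IBT)) sets the bit
before the mask runs. Feeding model_f(MODEL_FEATURE_LA57) into the 7_EDX mask
would bump the counter only if 7_ECX had been masked earlier, which matches the
"(some) scenarios" caveat.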

>