linux-kernel - Re: [PATCH v3 13/21] KVM:VMX: Emulate reads and writes to CET MSRs

Message-ID: <44d59b64-716f-fa58-67ee-d66beb9132d2@intel.com>
Date:   Tue, 27 Jun 2023 11:32:44 +0800
From:   "Yang, Weijiang" <weijiang.yang@...el.com>
To:     Sean Christopherson <seanjc@...gle.com>
CC:     <pbonzini@...hat.com>, <kvm@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <peterz@...radead.org>,
        <rppt@...nel.org>, <binbin.wu@...ux.intel.com>,
        <rick.p.edgecombe@...el.com>, <john.allen@....com>,
        Sean Christopherson <sean.j.christopherson@...el.com>
Subject: Re: [PATCH v3 13/21] KVM:VMX: Emulate reads and writes to CET MSRs


On 6/27/2023 5:15 AM, Sean Christopherson wrote:
> On Mon, Jun 26, 2023, Weijiang Yang wrote:
>> On 6/24/2023 7:53 AM, Sean Christopherson wrote:
>>> On Thu, May 11, 2023, Yang Weijiang wrote:
>>> Side topic, what on earth does the SDM mean by this?!?
>>>
>>>     The linear address written must be aligned to 8 bytes and bits 2:0 must be 0
>>>     (hardware requires bits 1:0 to be 0).
>>>
>>> I know Intel retroactively changed the alignment requirements, but the above
>>> is nonsensical.  If ucode prevents writing bits 2:0, who cares what hardware
>>> requires?
>> So do I ;-/
> Can you follow up with someone to get clarification?  If writing bit 2 with '1'
> does not #GP despite the statement that it "must be aligned", then KVM shouldn't
> inject a #GP in that case.

OK, will consult someone and get back to this thread.
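
For reference while this gets clarified, the two readings of that SDM sentence
differ only for an address with bit 2 set. A minimal sketch follows; the helper
names are made up for illustration and are not KVM code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, not KVM code: the two readings of the SDM text. */

/* Strict reading: 8-byte aligned, i.e. bits 2:0 must be zero. */
static bool ssp_is_aligned_8(uint64_t addr)
{
	return (addr & 0x7) == 0;
}

/* Relaxed reading: only bits 1:0 must be zero, i.e. 4-byte aligned. */
static bool ssp_is_aligned_4(uint64_t addr)
{
	return (addr & 0x3) == 0;
}
```

An address such as 0x1004 passes the relaxed check but fails the strict one, so
that is the value to try on hardware to see whether the write takes a #GP.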

>
>>>> +			return 1;
>>>> +		kvm_set_xsave_msr(msr_info);
>>>> +		break;
>>>>    	case MSR_IA32_PERF_CAPABILITIES:
>>>>    		if (data && !vcpu_to_pmu(vcpu)->version)
>>>>    			return 1;
>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>> index b6eec9143129..2e3a39c9297c 100644
>>>> --- a/arch/x86/kvm/x86.c
>>>> +++ b/arch/x86/kvm/x86.c
>>>> @@ -13630,6 +13630,26 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
>>>>    }
>>>>    EXPORT_SYMBOL_GPL(kvm_sev_es_string_io);
>>>> +bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu, struct msr_data *msr)
>>>> +{
>>>> +	if (!kvm_cet_user_supported())
>>> This feels wrong.  KVM should differentiate between SHSTK and IBT in the host.
>>> E.g. if running in a VM with SHSTK but not IBT, or vice versa, KVM should allow
>>> writes to non-existent MSRs.
>> I don't follow. In this case, on whose behalf is KVM acting: the guest or
>> userspace?
> Sorry, typo.  KVM *shouldn't* allow writes to non-existent MSRs.
>
>>> I.e. this looks wrong:
>>>
>>> 	/*
>>> 	 * If SHSTK and IBT are available in KVM, clear CET user bit in
>>> 	 * kvm_caps.supported_xss so that kvm_cet_user_supported() returns
>>> 	 * false when called.
>>> 	 */
>>> 	if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
>>> 	    !kvm_cpu_cap_has(X86_FEATURE_IBT))
>>> 		kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_USER;
>> The comment is wrong, it should be "are not available in KVM". My intent is,
>> if neither feature is available in KVM, then clear the precondition bit so
>> that all dependent checks will fail quickly.
> Checking kvm_caps.supported_xss.CET_USER is worthless in 99% of the cases though.
> Unless I'm missing something, the only time it's useful is for CR4.CET, which
> doesn't differentiate between SHSTK and IBT.  For everything else that KVM cares
> about, at some point KVM needs to precisely check for SHSTK and IBT support
> anyways

I will tweak the patches and do precise checks based on the features
available to the guest.

>>> and by extension, all dependent code is also wrong.  IIRC, there's a virtualization
>>> hole, but I don't see any reason why KVM has to make the hole even bigger.
>> Do you mean the issue that both SHSTK and IBT share one control MSR? i.e.,
>> U_CET/S_CET?
> I mean that passing through PLx_SSP if the host has IBT but *not* SHSTK is wrong.

Understood.

>
>>>> +		return false;
>>>> +
>>>> +	if (msr->host_initiated)
>>>> +		return true;
>>>> +
>>>> +	if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
>>>> +	    !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
>>>> +		return false;
>>>> +
>>>> +	if (msr->index == MSR_IA32_PL3_SSP &&
>>>> +	    !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
>>> I probably asked this long ago, but if I did I since forgot.  Is it really just
>>> PL3_SSP that depends on SHSTK?  I would expect all shadow stack MSRs to depend
>>> on SHSTK.
>> All PL{0,1,2,3}_SSP MSRs plus the INT_SSP_TAB MSR depend on SHSTK. In patch 21,
>> I added more MSRs to this helper.
> Sure, except that patch 21 never adds handling for PL{0,1,2}_SSP.  I see:
>
> 	if (!kvm_cet_user_supported() &&
> 	    !(kvm_cpu_cap_has(X86_FEATURE_IBT) ||
> 	      kvm_cpu_cap_has(X86_FEATURE_SHSTK)))
> 		return false;
>
> 	if (msr->host_initiated)
> 		return true;
>
> 	if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
> 	    !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
> 		return false;
>
> 	/* The synthetic MSR is for userspace access only. */
> 	if (msr->index == MSR_KVM_GUEST_SSP)
> 		return false;
>
> 	if (msr->index == MSR_IA32_U_CET)
> 		return true;
>
> 	if (msr->index == MSR_IA32_S_CET)
> 		return guest_cpuid_has(vcpu, X86_FEATURE_IBT) ||
> 		       kvm_cet_kernel_shstk_supported();
>
> 	if (msr->index == MSR_IA32_INT_SSP_TAB)
> 		return guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
> 		       kvm_cet_kernel_shstk_supported();
>
> 	if (msr->index == MSR_IA32_PL3_SSP &&
> 	    !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
> 		return false;
>
> 	mask = (msr->index == MSR_IA32_PL3_SSP) ? XFEATURE_MASK_CET_USER :
> 						  XFEATURE_MASK_CET_KERNEL;
> 	return !!(kvm_caps.supported_xss & mask);
>
> Which means that KVM will allow guest accesses to PL{0,1,2}_SSP regardless of
> whether or not X86_FEATURE_SHSTK is enumerated to the guest.

Hmm, the check of X86_FEATURE_SHSTK is missing in this case.

>
> And the above is also wrong for host_initiated writes to SHSTK MSRs.  E.g. if KVM
> is running on a CPU that has IBT but not SHSTK, then userspace can write to MSRs
> that do not exist.
>
> Maybe this confusion is just a symptom of the series not providing proper
> Supervisor Shadow Stack support, but that's still a poor excuse for posting
> broken code.
>
> I suspect you tried to get too fancy.  I don't see any reason to ever care about
> kvm_caps.supported_xss beyond emulating writes to XSS itself.  Just require that
> both CET_USER and CET_KERNEL are supported in XSS to allow IBT or SHSTK, i.e. let
> X86_FEATURE_IBT and X86_FEATURE_SHSTK speak for themselves.  That way, this can
> simply be:

You're right, kvm_cet_user_supported() is overused.

Let me recap to see if I understand correctly:

1. Check that both CET_USER and CET_KERNEL are supported in XSS before
advertising SHSTK as supported in KVM and exposing it to the guest. The
reason is that once SHSTK is exposed to the guest, KVM should support both
modes to honor architectural integrity.

2. Check that CET_USER is supported before advertising IBT as supported in
KVM and exposing IBT. The reason is that user IBT (MSR_U_CET) depends on the
CET_USER bit while kernel IBT (MSR_S_CET) doesn't.
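
To make the recap concrete, here is a rough sketch of the two conditions as I
understand them. It is plain C rather than KVM code: the masks are local
stand-ins for XFEATURE_MASK_CET_USER and XFEATURE_MASK_CET_KERNEL, and the
function names are hypothetical. Note that your suggestion is stricter for IBT
(it requires both bits for either feature), so point 2 may need to change.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the kernel's XFEATURE_MASK_CET_* definitions. */
#define CET_USER_MASK   (1ULL << 11)
#define CET_KERNEL_MASK (1ULL << 12)

/* Point 1: advertise SHSTK only if both CET states are supported in XSS. */
static bool can_advertise_shstk(uint64_t supported_xss)
{
	uint64_t both = CET_USER_MASK | CET_KERNEL_MASK;

	return (supported_xss & both) == both;
}

/* Point 2 (as recapped): advertise IBT if CET_USER is supported in XSS. */
static bool can_advertise_ibt(uint64_t supported_xss)
{
	return (supported_xss & CET_USER_MASK) != 0;
}
```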

>
> bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu, struct msr_data *msr)
> {
> 	if (is_shadow_stack_msr(...)) {
> 		if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
> 			return false;
>
> 		return msr->host_initiated ||
> 		       guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
> 	}
>
> 	if (!kvm_cpu_cap_has(X86_FEATURE_IBT) &&
> 	    !kvm_cpu_cap_has(X86_FEATURE_SHSTK))
> 		return false;

Should the checks above move to the beginning of the function?

>
> 	return msr->host_initiated ||
> 	       guest_cpuid_has(vcpu, X86_FEATURE_IBT) ||
> 	       guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
> }
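
As a sanity check on the logic, the helper above can be modeled in plain
userspace C. This is only a sketch: struct cet_caps, enum cet_msr and
cet_msr_accessible() are hypothetical stand-ins for kvm_cpu_cap_has(),
guest_cpuid_has() and the real MSR indices, and is_shadow_stack_msr() is
assumed to cover PL{0,1,2,3}_SSP plus INT_SSP_TAB.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real MSR indices. */
enum cet_msr {
	CET_MSR_U_CET,
	CET_MSR_S_CET,
	CET_MSR_PL0_SSP,
	CET_MSR_PL1_SSP,
	CET_MSR_PL2_SSP,
	CET_MSR_PL3_SSP,
	CET_MSR_INT_SSP_TAB,
};

/* Models the SHSTK/IBT feature bits of either the host or the guest. */
struct cet_caps {
	bool shstk;
	bool ibt;
};

/* Assumption: every SSP MSR and the interrupt SSP table depend on SHSTK. */
static bool is_shadow_stack_msr(enum cet_msr msr)
{
	switch (msr) {
	case CET_MSR_PL0_SSP:
	case CET_MSR_PL1_SSP:
	case CET_MSR_PL2_SSP:
	case CET_MSR_PL3_SSP:
	case CET_MSR_INT_SSP_TAB:
		return true;
	default:
		return false;
	}
}

static bool cet_msr_accessible(struct cet_caps host, struct cet_caps guest,
			       enum cet_msr msr, bool host_initiated)
{
	if (is_shadow_stack_msr(msr)) {
		/* The MSR does not exist at all without host SHSTK. */
		if (!host.shstk)
			return false;

		return host_initiated || guest.shstk;
	}

	/* U_CET and S_CET exist if the host has either feature. */
	if (!host.ibt && !host.shstk)
		return false;

	return host_initiated || guest.ibt || guest.shstk;
}
```

With a host that has IBT but not SHSTK, a host_initiated access to PL3_SSP is
rejected in this model, which is the case the current patch gets wrong.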
>
>>>> + * and reload the guest fpu states before read/write xsaves-managed MSRs.
>>>> + */
>>>> +static inline void kvm_get_xsave_msr(struct msr_data *msr_info)
>>>> +{
>>>> +	fpregs_lock_and_load();
>>> KVM already has helpers that do exactly this, and they have far better names for
>>> KVM: kvm_fpu_get() and kvm_fpu_put().  Can you convert kvm_fpu_get() to
>>> fpregs_lock_and_load() and use those instead? And if the extra consistency checks
>>> in fpregs_lock_and_load() fire, we definitely want to know, as it means we probably
>>> have bugs in KVM.
>> Do you want me to do some experiments to check whether the WARN() in
>> fpregs_lock_and_load() is triggered?
> Yes, though I shouldn't have to clarify that.  The well-documented (as of now)
> expectation is that any code that someone posts is tested, unless explicitly
> stated otherwise.  I.e. you should not have to ask if you should verify the WARN
> doesn't trigger, because you should be doing that for all code you post.

Sure, I will test the change.

>
>> If no WARN() triggers, then replace fpregs_lock_and_load()/fpregs_unlock()
>> with kvm_fpu_get()/kvm_fpu_put()?
> Yes.