linux-kernel - Re: [PATCH v3 20/21] KVM:x86: Enable kernel IBT support for guest

Message-ID: <ZJn6F93Ed/i18BL5@google.com>
Date:   Mon, 26 Jun 2023 13:50:31 -0700
From:   Sean Christopherson <seanjc@...gle.com>
To:     Weijiang Yang <weijiang.yang@...el.com>
Cc:     pbonzini@...hat.com, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, peterz@...radead.org,
        rppt@...nel.org, binbin.wu@...ux.intel.com,
        rick.p.edgecombe@...el.com, john.allen@....com
Subject: Re: [PATCH v3 20/21] KVM:x86: Enable kernel IBT support for guest

On Mon, Jun 26, 2023, Weijiang Yang wrote:
> 
> On 6/24/2023 8:03 AM, Sean Christopherson wrote:
> > > @@ -7322,6 +7331,19 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
> > >   	kvm_wait_lapic_expire(vcpu);
> > > +	/*
> > > +	 * Save host MSR_IA32_S_CET so that it can be reloaded at vm_exit.
> > > +	 * No need to save the other two vmcs fields as supervisor SHSTK
> > > +	 * are not enabled on Intel platform now.
> > > +	 */
> > > +	if (IS_ENABLED(CONFIG_X86_KERNEL_IBT) &&
> > > +	    (vm_exit_controls_get(vmx) & VM_EXIT_LOAD_CET_STATE)) {
> > > +		u64 msr;
> > > +
> > > +		rdmsrl(MSR_IA32_S_CET, msr);
> > Reading the MSR on every VM-Enter can't possibly be necessary.  At the absolute
> > minimum, this could be moved outside of the fastpath; if the kernel modifies S_CET
> > from NMI context, KVM is hosed.  And *if* S_CET isn't static post-boot, this can
> > be done in .prepare_switch_to_guest() so long as S_CET isn't modified from IRQ
> > context.
> 
> Agree with you.
> 
> > 
> > But unless mine eyes deceive me, S_CET is only truly modified during setup_cet(),
> > i.e. is static post boot, which means it can be read once at KVM load time, e.g.
> > just like host_efer.
> 
> I think handling S_CET like host_efer from usage perspective is possible
> given currently only kernel IBT is enabled in kernel, I'll remove the code
> and initialize the vmcs field once like host_efer.
> 
> > 
> > The kernel does save/restore IBT when making BIOS calls, but if KVM is running a
> > vCPU across a BIOS call then we've got bigger issues.
> 
> What's the problem you're referring to?

I was pointing out that S_CET isn't strictly constant, as it's saved/modified/restored
by ibt_save() + ibt_restore().  But KVM should never run between those paired
functions, so from KVM's perspective the host value is effectively constant.
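
To make the "read once, like host_efer" idea concrete, here is a minimal
userspace sketch, not kernel code.  Every name in it (read_s_cet(),
hardware_setup(), init_vmcs(), vcpu_run(), and the fake MSR/VMCS variables)
is a hypothetical stand-in invented for illustration, and the MSR value is
arbitrary.  It only models the shape of the change: snapshot the host value
at load time, write it to the VMCS field during VMCS setup, and keep the
run loop free of MSR reads.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the kernel's real definitions. */
static uint64_t fake_s_cet_msr = 0x4;	/* arbitrary example value */
static unsigned int msr_reads;		/* counts simulated RDMSRs */

/* Simulated MSR read, standing in for rdmsrl(MSR_IA32_S_CET, ...). */
static uint64_t read_s_cet(void)
{
	msr_reads++;
	return fake_s_cet_msr;
}

static uint64_t host_s_cet;		/* cached once, analogous to host_efer */
static uint64_t vmcs_host_s_cet;	/* stand-in for the HOST_S_CET field */

/* Load time: snapshot the host value exactly once. */
static void hardware_setup(void)
{
	host_s_cet = read_s_cet();
}

/* VMCS setup: program the constant host state from the cached value. */
static void init_vmcs(void)
{
	vmcs_host_s_cet = host_s_cet;
}

/* Run loop: no MSR read, the VMCS field is already correct. */
static void vcpu_run(void)
{
	assert(vmcs_host_s_cet == host_s_cet);
}
```

In this model, calling hardware_setup() and init_vmcs() once and then
vcpu_run() any number of times leaves msr_reads at 1, whereas the posted
hunk would perform one read per entry.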

> > > +		vmcs_writel(HOST_S_CET, msr);
> > > +	}
> > > +
> > >   	/* The actual VMENTER/EXIT is in the .noinstr.text section. */
> > >   	vmx_vcpu_enter_exit(vcpu, __vmx_vcpu_run_flags(vmx));
> > > @@ -7735,6 +7757,13 @@ static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu)
> > >   	incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
> > >   	vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, MSR_TYPE_RW, incpt);
> > > +
> > > +	/*
> > > +	 * If IBT is available to guest, then passthrough S_CET MSR too since
> > > +	 * kernel IBT is already in mainline kernel tree.
> > > +	 */
> > > +	incpt = !guest_cpuid_has(vcpu, X86_FEATURE_IBT);
> > > +	vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, incpt);
> > >   }
> > >   static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > > @@ -7805,7 +7834,7 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > >   	/* Refresh #PF interception to account for MAXPHYADDR changes. */
> > >   	vmx_update_exception_bitmap(vcpu);
> > > -	if (kvm_cet_user_supported())
> > > +	if (kvm_cet_user_supported() || kvm_cpu_cap_has(X86_FEATURE_IBT))
> > Yeah, kvm_cet_user_supported() simply looks wrong.
> 
> These are preconditions to set up CET MSRs for guest, in
> vmx_update_intercept_for_cet_msr(), the actual MSR control is based on
> guest_cpuid_has() results.

I know.  My point is that with the below combination, 

	kvm_cet_user_supported()		= true
	kvm_cpu_cap_has(X86_FEATURE_IBT)	= false 
	guest_cpuid_has(vcpu, X86_FEATURE_IBT)	= true

KVM will pass through MSR_IA32_S_CET for guest IBT even though IBT isn't supported
on the host.

	incpt = !guest_cpuid_has(vcpu, X86_FEATURE_IBT);
	vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, incpt);

So either KVM is broken and is passing through S_CET when it shouldn't, or the
check on kvm_cet_user_supported() is redundant, i.e. the above combination is
impossible.

Either way, the code *looks* wrong, which is almost as bad as it being functionally
wrong.
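
As a standalone illustration of the combination above, here is a small
userspace model, not kernel code.  The struct and all three functions are
hypothetical names made up for this sketch; the booleans merely stand in for
the results of kvm_cpu_cap_has(X86_FEATURE_IBT) and
guest_cpuid_has(vcpu, X86_FEATURE_IBT).  The "gated" variant is one possible
way to make the intent explicit, offered as an assumption rather than as the
fix the series should necessarily adopt.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two inputs that matter for S_CET. */
struct cet_caps {
	bool host_ibt;	/* models kvm_cpu_cap_has(X86_FEATURE_IBT) */
	bool guest_ibt;	/* models guest_cpuid_has(vcpu, X86_FEATURE_IBT) */
};

/* Mirrors the quoted hunk: only guest CPUID decides interception. */
static bool s_cet_intercepted_as_posted(struct cet_caps c)
{
	return !c.guest_ibt;
}

/* Assumed alternative: pass through only if host and guest both have IBT. */
static bool s_cet_intercepted_gated(struct cet_caps c)
{
	return !(c.host_ibt && c.guest_ibt);
}

/* The combination from the mail: host lacks IBT, guest CPUID claims it. */
static void demo_problem_case(void)
{
	struct cet_caps c = { .host_ibt = false, .guest_ibt = true };

	/* As posted, the MSR is passed through (not intercepted). */
	assert(!s_cet_intercepted_as_posted(c));
	/* With the explicit host check, the MSR stays intercepted. */
	assert(s_cet_intercepted_gated(c));
}
```

If guest CPUID can never advertise IBT without host support, the two
functions agree on every reachable input and the extra check is purely
documentation, which is the "redundant" branch of the either/or above.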