linux-kernel - Re: [PATCH v5 03/26] x86/hyperv: Update 'struct hv_enlightened


Message-ID: <YwOm7Ph54vIYAllm@google.com>
Date:   Mon, 22 Aug 2022 15:55:24 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Vitaly Kuznetsov <vkuznets@...hat.com>
Cc:     kvm@...r.kernel.org, Paolo Bonzini <pbonzini@...hat.com>,
        Anirudh Rayabharam <anrayabh@...ux.microsoft.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Maxim Levitsky <mlevitsk@...hat.com>,
        Nathan Chancellor <nathan@...nel.org>,
        Michael Kelley <mikelley@...rosoft.com>,
        linux-hyperv@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 03/26] x86/hyperv: Update 'struct hv_enlightened_vmcs'
 definition

On Mon, Aug 22, 2022, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@...gle.com> writes:
> 
> > On Thu, Aug 18, 2022, Vitaly Kuznetsov wrote:
> >> Sean Christopherson <seanjc@...gle.com> writes:
> >> 
> >> > On Tue, Aug 02, 2022, Vitaly Kuznetsov wrote:
> >> >> + * Note: HV_X64_NESTED_EVMCS1_2022_UPDATE is not currently documented in any
> >> >> + * published TLFS version. When the bit is set, nested hypervisor can use
> >> >> + * 'updated' eVMCSv1 specification (perf_global_ctrl, s_cet, ssp, lbr_ctl,
> >> >> + * encls_exiting_bitmap, tsc_multiplier fields which were missing in 2016
> >> >> + * specification).
> >> >> + */
> >> >> +#define HV_X64_NESTED_EVMCS1_2022_UPDATE		BIT(0)
> >> >
> >> > This bit is now defined[*], but the docs say it's only for perf_global_ctrl.  Are
> >> > we expecting an update to the TLFS?
> >> >
> >> > 	Indicates support for the GuestPerfGlobalCtrl and HostPerfGlobalCtrl fields
> >> > 	in the enlightened VMCS.
> >> >
> >> > [*] https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/feature-discovery#hypervisor-nested-virtualization-features---0x4000000a
> >> >
> >> 
> >> Oh well, better this than nothing. I'll ping the people who told me
> >> about this bit that their description is incomplete.
> >
> > Not that it changes anything, but I'd rather have no documentation.  I'd much rather
> > KVM say "this is the undocumented behavior" than "the documented behavior is wrong".
> >
> 
> So I reached out to Microsoft and their answer was that for all these new
> eVMCS fields (including *PerfGlobalCtrl) observing architectural VMX
> MSRs should be enough. The *PerfGlobalCtrl case is special because of a Win11
> bug (if we expose the feature in VMX feature MSRs but don't set
> CPUID.0x4000000A.EBX BIT(0) it just doesn't boot).

I.e. TSC_SCALING shouldn't be gated on the flag?  If so, then the 2-D array approach
is overkill since (a) the CPUID flag only controls PERF_GLOBAL_CTRL and (b) we aren't
expecting any more flags in the future.

What about this for an implementation?

static bool evmcs_has_perf_global_ctrl(struct kvm_vcpu *vcpu)
{
	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);

	/*
	 * Filtering VMX controls for eVMCS compatibility should only be done
	 * for guest accesses, and all such accesses should be gated on Hyper-V
	 * being enabled and initialized.
	 */
	if (WARN_ON_ONCE(!hv_vcpu))
		return false;

	return hv_vcpu->cpuid_cache.nested_ebx & HV_X64_NESTED_EVMCS1_PERF_GLOBAL_CTRL;
}

static u32 evmcs_get_unsupported_ctls(struct kvm_vcpu *vcpu, u32 msr_index)
{
	u32 unsupported_ctrls;

	switch (msr_index) {
	case MSR_IA32_VMX_EXIT_CTLS:
	case MSR_IA32_VMX_TRUE_EXIT_CTLS:
		unsupported_ctrls = EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
		if (!evmcs_has_perf_global_ctrl(vcpu))
			unsupported_ctrls |= VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
		return unsupported_ctrls;
	case MSR_IA32_VMX_ENTRY_CTLS:
	case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
		unsupported_ctrls = EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
		if (!evmcs_has_perf_global_ctrl(vcpu))
			unsupported_ctrls |= VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
		return unsupported_ctrls;
	case MSR_IA32_VMX_PROCBASED_CTLS2:
		return EVMCS1_UNSUPPORTED_2NDEXEC;
	case MSR_IA32_VMX_TRUE_PINBASED_CTLS:
	case MSR_IA32_VMX_PINBASED_CTLS:
		return EVMCS1_UNSUPPORTED_PINCTRL;
	case MSR_IA32_VMX_VMFUNC:
		return EVMCS1_UNSUPPORTED_VMFUNC;
	default:
		KVM_BUG_ON(1, vcpu->kvm);
		return 0;
	}
}

void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
{
	u64 unsupported_ctrls = evmcs_get_unsupported_ctls(vcpu, msr_index);

	if (msr_index == MSR_IA32_VMX_VMFUNC)
		*pdata &= ~unsupported_ctrls;
	else
		*pdata &= ~(unsupported_ctrls << 32);
}
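
For anyone wondering about the shift by 32 in the filter above: the VMX control
capability MSRs report allowed-0 settings in bits 31:0 and allowed-1 settings in
bits 63:32, whereas the VMFUNC MSR is a plain bitmap of allowed-1 settings.  A
standalone userspace sketch of just that arithmetic follows.  All demo_* names
and the DEMO_* bit value are made up for illustration and are not the kernel's
definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit position only; not copied from the kernel headers. */
#define DEMO_LOAD_PERF_GLOBAL_CTRL	(UINT32_C(1) << 12)

/*
 * Control capability MSRs: bits 31:0 are allowed-0 settings, bits 63:32 are
 * allowed-1 settings.  Hiding a control from the guest means clearing its
 * bit in the high half only.
 */
static uint64_t demo_filter_ctls_msr(uint64_t msr, uint32_t unsupported)
{
	return msr & ~((uint64_t)unsupported << 32);
}

/* The VMFUNC capability MSR is a flat allowed-1 bitmap, so no shift. */
static uint64_t demo_filter_vmfunc_msr(uint64_t msr, uint64_t unsupported)
{
	return msr & ~unsupported;
}

/* Small self-check that exercises both helpers. */
void demo_shift_self_check(void)
{
	/* High half advertises the control, low half is arbitrary. */
	uint64_t msr = UINT64_C(0x0000100000000fff);

	/* Only the allowed-1 bit is cleared; the low half is untouched. */
	assert(demo_filter_ctls_msr(msr, DEMO_LOAD_PERF_GLOBAL_CTRL) ==
	       UINT64_C(0x0000000000000fff));

	/* VMFUNC: bit 0 cleared directly in the low bits. */
	assert(demo_filter_vmfunc_msr(UINT64_C(0x3), UINT64_C(0x1)) ==
	       UINT64_C(0x2));
}
```

This is also why the helper can return u32 for every MSR: the widening to u64
happens in the caller before the shift.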


> What I'm still concerned about is future proofing KVM for new
> features. When something is getting added to KVM for which no eVMCS
> field is currently defined, both Hyper-V-on-KVM and KVM-on-Hyper-V cases
> should be taken care of. It would probably be better to reverse our
> filtering, explicitly listing features supported in eVMCS. The lists are
> going to be fairly long but at least we won't have to take care of any
> new architectural feature added to KVM.

Having the filtering be opt-in crossed my mind as well.  Reversing the filtering
can be done after this series though, correct?