linux-kernel - Re: [PATCH] Revert "KVM: nVMX: Expose load IA32_PERF_GLOBAL

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <YtrRg03bsPVJLSBx@google.com>
Date:   Fri, 22 Jul 2022 16:34:11 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        oliver.upton@...ux.dev
Subject: Re: [PATCH] Revert "KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL
 VM-{Entry,Exit} control"

On Fri, Jul 22, 2022, Paolo Bonzini wrote:
> This reverts commit 03a8871add95213827e2bea84c12133ae5df952e.
> 
> Since commit 03a8871add95 ("KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL
> VM-{Entry,Exit} control"), KVM has taken ownership of the "load
> IA32_PERF_GLOBAL_CTRL" VMX entry/exit control bits, trying to set these
> bits in the IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS MSRs if the guest's CPUID
> supports the architectural PMU (CPUID[EAX=0Ah].EAX[7:0]=1), and clear
> otherwise.
> 
> This was a misguided attempt at mimicking what commit 5f76f6f5ff96
> ("KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled",
> 2018-10-01) did for MPX.  However, that commit was a workaround for
> another KVM bug and not something that should be imitated.  Mucking with
> the VMX MSRs creates a subtle, difficult to maintain ABI as KVM must
> ensure that any internal changes, e.g. to how KVM handles _any_ guest
> CPUID changes, yield the same functional result.  Therefore, KVM's policy
> is to let userspace have full control of the guest vCPU model so long
> as the host kernel is not at risk.
> 
> And that's the snag: setting the bit must not cause any harm to the host,
> therefore we need to be sure that the kvm_set_msr will actually succeed.

() on functions please.

> Furthermore, it is plausible to have a hypervisor that sets the controls
> unconditionally and just leaves GUEST/HOST_IA32_PERF_GLOBAL_CTRL to 0, and
> we don't want to regress that case.  The simplest way to handle
> both issues is to skip the call to kvm_set_msr if the value of
> MSR_CORE_PERF_GLOBAL_CTRL is not changing.  This covers trivially the case
> where the PMU is not available and the only acceptable value of the MSR is
> zero, because nonzero values are filtered in nested_vmx_check_host_state

()

> and nested_vmx_check_guest_state.

Hmm, this is just trading one hack for another.  The real problem is that KVM
has a WARN that can be triggered by userspace sending a misconfigured vCPU model.

And calling kvm_set_msr() iff the value is changing is trivial to exploit, e.g.
userspace does KVM_SET_CPUID with a valid PMU and stuffs MSR_CORE_PERF_GLOBAL_CTRL
to a non-zero value then does a "bad" KVM_SET_CPUID and a nested VM-Enter.  An
even more devious attack would be to do back-to-back KVM_SET_CPUID to get a valid
pmu->global_ctrl_mask with pmu->version==0 (which is a bug in intel_pmu_refresh())
in order to bypass this check:

	if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL) &&
	    CC(!kvm_valid_perf_global_ctrl(vcpu_to_pmu(vcpu),
					   vmcs12->guest_ia32_perf_global_ctrl)))
		return -EINVAL;

and then nested VM-Enter with a non-zero guest_ia32_perf_global_ctrl.

Blech, I was going to type up a suggested "flow", but untangling this requires
multiple patches (there are multiple bugs).  I'll just send a series with this as
a pure revert of 03a8871add95213827e2bea84c12133ae5df952e as the last patch.

> Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
> ---
>  arch/x86/kvm/vmx/nested.c    | 26 +++-----------------------
>  arch/x86/kvm/vmx/nested.h    |  2 --
>  arch/x86/kvm/vmx/pmu_intel.c |  3 ---
>  3 files changed, 3 insertions(+), 28 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index c1c85fd75d42..6d25de9ebefa 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2623,6 +2623,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
>  	}
>  
>  	if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL) &&
> +	    vmcs12->guest_ia32_perf_global_ctrl != vcpu_to_pmu(vcpu)->global_ctrl &&
>  	    WARN_ON_ONCE(kvm_set_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL,
>  				     vmcs12->guest_ia32_perf_global_ctrl))) {
>  		*entry_failure_code = ENTRY_FAIL_DEFAULT;
> @@ -4333,7 +4334,8 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
>  		vmcs_write64(GUEST_IA32_PAT, vmcs12->host_ia32_pat);
>  		vcpu->arch.pat = vmcs12->host_ia32_pat;
>  	}
> -	if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL)
> +	if ((vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL)
> +	    && vmcs12->host_ia32_perf_global_ctrl != vcpu_to_pmu(vcpu)->global_ctrl)

Should be a moot point, but put the "&&" on the first line.

>  		WARN_ON_ONCE(kvm_set_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL,
>  					 vmcs12->host_ia32_perf_global_ctrl));
>