[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <YtrRg03bsPVJLSBx@google.com>
Date: Fri, 22 Jul 2022 16:34:11 +0000
From: Sean Christopherson <seanjc@...gle.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
oliver.upton@...ux.dev
Subject: Re: [PATCH] Revert "KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL
VM-{Entry,Exit} control"
On Fri, Jul 22, 2022, Paolo Bonzini wrote:
> This reverts commit 03a8871add95213827e2bea84c12133ae5df952e.
>
> Since commit 03a8871add95 ("KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL
> VM-{Entry,Exit} control"), KVM has taken ownership of the "load
> IA32_PERF_GLOBAL_CTRL" VMX entry/exit control bits, trying to set these
> bits in the IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS MSRs if the guest's CPUID
> supports the architectural PMU (CPUID[EAX=0Ah].EAX[7:0]=1), and clear
> otherwise.
>
> This was a misguided attempt at mimicking what commit 5f76f6f5ff96
> ("KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled",
> 2018-10-01) did for MPX. However, that commit was a workaround for
> another KVM bug and not something that should be imitated. Mucking with
> the VMX MSRs creates a subtle, difficult to maintain ABI as KVM must
> ensure that any internal changes, e.g. to how KVM handles _any_ guest
> CPUID changes, yield the same functional result. Therefore, KVM's policy
> is to let userspace have full control of the guest vCPU model so long
> as the host kernel is not at risk.
>
> And that's the snag: setting the bit must not cause any harm to the host,
> therefore we need to be sure that the kvm_set_msr will actually succeed.
() on functions please.
> Furthermore, it is plausible to have a hypervisor that sets the controls
> unconditionally and just leaves GUEST/HOST_IA32_PERF_GLOBAL_CTRL to 0, and
> we don't want to regress that case. The simplest way to handle
> both issues is to skip the call to kvm_set_msr if the value of
> MSR_CORE_PERF_GLOBAL_CTRL is not changing. This covers trivially the case
> where the PMU is not available and the only acceptable value of the MSR is
> zero, because nonzero values are filtered in nested_vmx_check_host_state
()
> and nested_vmx_check_guest_state.
Hmm, this is just trading one hack for another. The real problem is that KVM
has a WARN that can be triggered by userspace sending a misconfigured vCPU model.
And calling kvm_set_msr() iff the value is changing is trivial to exploit, e.g.
userspace does KVM_SET_CPUID with a valid PMU and stuffs MSR_CORE_PERF_GLOBAL_CTRL
to a non-zero value then does a "bad" KVM_SET_CPUID and a nested VM-Enter. An
even more devious attack would be to do back-to-back KVM_SET_CPUID to get a valid
pmu->global_ctrl_mask with pmu->version==0 (which is a bug in intel_pmu_refresh())
in order to bypass this check:
if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL) &&
CC(!kvm_valid_perf_global_ctrl(vcpu_to_pmu(vcpu),
vmcs12->guest_ia32_perf_global_ctrl)))
return -EINVAL;
and then nested VM-Enter with a non-zero guest_ia32_perf_global_ctrl.
Blech, I was going to type up a suggested "flow", but untangling this requires
multiple patches (there are multiple bugs). I'll just send a series with this as
a pure revert of 03a8871add95213827e2bea84c12133ae5df952e as the last patch.
> Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
> ---
> arch/x86/kvm/vmx/nested.c | 26 +++-----------------------
> arch/x86/kvm/vmx/nested.h | 2 --
> arch/x86/kvm/vmx/pmu_intel.c | 3 ---
> 3 files changed, 3 insertions(+), 28 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index c1c85fd75d42..6d25de9ebefa 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2623,6 +2623,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
> }
>
> if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL) &&
> + vmcs12->guest_ia32_perf_global_ctrl != vcpu_to_pmu(vcpu)->global_ctrl &&
> WARN_ON_ONCE(kvm_set_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL,
> vmcs12->guest_ia32_perf_global_ctrl))) {
> *entry_failure_code = ENTRY_FAIL_DEFAULT;
> @@ -4333,7 +4334,8 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
> vmcs_write64(GUEST_IA32_PAT, vmcs12->host_ia32_pat);
> vcpu->arch.pat = vmcs12->host_ia32_pat;
> }
> - if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL)
> + if ((vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL)
> + && vmcs12->host_ia32_perf_global_ctrl != vcpu_to_pmu(vcpu)->global_ctrl)
Should be a moot point, but put the "&&" on the first line.
> WARN_ON_ONCE(kvm_set_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL,
> vmcs12->host_ia32_perf_global_ctrl));
>
Powered by blists - more mailing lists