[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Yl2AaxXFh7UfvpFx@google.com>
Date: Mon, 18 Apr 2022 15:14:51 +0000
From: Sean Christopherson <seanjc@...gle.com>
To: Chao Gao <chao.gao@...el.com>
Cc: Zeng Guang <guang.zeng@...el.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
Dave Hansen <dave.hansen@...ux.intel.com>,
Tony Luck <tony.luck@...el.com>,
Kan Liang <kan.liang@...ux.intel.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>,
Kim Phillips <kim.phillips@....com>,
Jarkko Sakkinen <jarkko@...nel.org>,
Jethro Beekman <jethro@...tanix.com>,
Kai Huang <kai.huang@...el.com>, x86@...nel.org,
linux-kernel@...r.kernel.org, Robert Hu <robert.hu@...el.com>
Subject: Re: [PATCH v8 9/9] KVM: VMX: enable IPI virtualization
On Mon, Apr 18, 2022, Chao Gao wrote:
> On Fri, Apr 15, 2022 at 03:25:06PM +0000, Sean Christopherson wrote:
> >On Mon, Apr 11, 2022, Zeng Guang wrote:
> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >> index d1a39285deab..23fbf52f7bea 100644
> >> --- a/arch/x86/kvm/x86.c
> >> +++ b/arch/x86/kvm/x86.c
> >> @@ -11180,11 +11180,15 @@ static int sync_regs(struct kvm_vcpu *vcpu)
> >>
> >> int kvm_arch_vcpu_precreate(struct kvm *kvm, unsigned int id)
> >> {
> >> + int ret = 0;
> >> +
> >> if (kvm_check_tsc_unstable() && atomic_read(&kvm->online_vcpus) != 0)
> >> pr_warn_once("kvm: SMP vm created on host with unstable TSC; "
> >> "guest TSC will not be reliable\n");
> >>
> >> - return 0;
> >> + if (kvm_x86_ops.alloc_ipiv_pid_table)
> >> + ret = static_call(kvm_x86_alloc_ipiv_pid_table)(kvm);
> >
> >Add a generic kvm_x86_ops.vcpu_precreate, no reason to make this so specific.
> >And use KVM_X86_OP_RET0 instead of KVM_X86_OP_OPTIONAL, then this can simply be
> >
> > return static_call(kvm_x86_vcpu_precreate);
> >
> >That said, there's a flaw in my genius plan.
> >
> > 1. KVM_CREATE_VM
> > 2. KVM_CAP_MAX_VCPU_ID, set max_vcpu_ids=1
> > 3. KVM_CREATE_VCPU, create IPIv table but ultimately fails
> > 4. KVM decrements created_vcpus back to '0'
> > 5. KVM_CAP_MAX_VCPU_ID, set max_vcpu_ids=4096
> > 6. KVM_CREATE_VCPU w/ ID out of range
> >
> >In other words, malicious userspace could trigger buffer overflow.
>
> can we simply return an error (e.g., -EEXIST) on step 5 (i.e.,
> max_vcpu_ids cannot be changed after being set once)?
>
> or
>
> can we detect the change of max_vcpu_ids in step 6 and re-allocate PID
> table?
Returning an error is viable, but would be a rather odd ABI. Re-allocating isn't
a good option because the PID table could be in active use by other vCPUs, e.g.
KVM would need to send a request and kick all vCPUs to have all vCPUs update their
VMCS.
And with both of those alternatives, I still don't like that every feature that
acts on max_vcpu_ids would need to handle this same edge case.
An alternative to another new ioctl() would be to to make KVM_CAP_MAX_VCPU_ID
write-once, i.e. reject attempts to change the max once set (though we could allow
re-writing the same value). I think I like that idea better than adding an ioctl().
It can even be done without an extra flag by zero-initializing the field and instead
waiting until vCPU pre-create to lock in the value. That would also help detect
bad usage of max_vcpu_ids, especially if we added a wrapper to get the value, e.g.
the wrapper could WARN_ON(!kvm->arch.max_vcpu_ids).
E.g.
int kvm_arch_vcpu_precreate(struct kvm *kvm, unsigned int id)
{
if (kvm_check_tsc_unstable() && atomic_read(&kvm->online_vcpus) != 0)
pr_warn_once("kvm: SMP vm created on host with unstable TSC; "
"guest TSC will not be reliable\n");
if (!kvm->arch.max_vcpu_ids)
kvm->arch.max_vcpu_ids = KVM_MAX_VCPU_IDS;
return 0;
}
case KVM_CAP_MAX_VCPU_ID:
r = -EINVAL;
if (cap->args[0] > KVM_MAX_VCPU_IDS)
break;
mutex_lock(&kvm->lock);
if (kvm->arch.max_vcpu_ids == cap->args[0]) {
r = 0;
} else if (!kvm->arch.max_vcpu_ids) {
kvm->arch.max_vcpu_ids = cap->args[0];
r = 0;
}
mutex_unlock(&kvm->lock);
break;
Powered by blists - more mailing lists