[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20221013085554.GA2756200@ls.amr.corp.intel.com>
Date: Thu, 13 Oct 2022 01:55:54 -0700
From: Isaku Yamahata <isaku.yamahata@...il.com>
To: Sagi Shahar <sagis@...gle.com>
Cc: isaku.yamahata@...el.com, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, isaku.yamahata@...il.com,
Paolo Bonzini <pbonzini@...hat.com>, erdemaktas@...gle.com,
Sean Christopherson <seanjc@...gle.com>,
Sean Christopherson <sean.j.christopherson@...el.com>,
Kai Huang <kai.huang@...el.com>
Subject: Re: [PATCH v9 016/105] KVM: TDX: create/destroy VM structure
On Wed, Oct 12, 2022 at 03:30:26PM -0700,
Sagi Shahar <sagis@...gle.com> wrote:
> > +int tdx_vm_init(struct kvm *kvm)
> > +{
> > + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
> > + cpumask_var_t packages;
> > + int ret, i;
> > + u64 err;
> > +
> > + /* vCPUs can't be created until after KVM_TDX_INIT_VM. */
> > + kvm->max_vcpus = 0;
>
> The fact that vCPUs can't be created until KVM_TDX_INIT_VM is called
> will make it difficult to implement intra host migration. See longer
> discussion below.
...
> Me, Sean and Isaku had a short discussion offline regarding the
> interaction between the proposed API in this patch and intra-host
> migration. To summarize:
>
> For intra-host migration you generally want the destination VM to be
> initialized including the right number of vCPUs before you migrate the
> source VM state into it.
> The proposed API makes it difficult since it forces the destination VM
> to call KVM_TDX_INIT_VM before creating vCPUs which initializes TDX
> state and allocate a new hkid for the destination VM which would never
> be used. This can create a resource limitation on migrating VMs where
> there shouldn't be one.
>
> To solve this issue there are 2 main proposed changes to the API:
>
> 1. Add a new API based on ioctl(KVM_ENABLE_CAP) to let userspace
> modify the max number of vcpus:
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 43a6a7efc6ec..6055098b025b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6278,6 +6278,18 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> }
> mutex_unlock(&kvm->lock);
> break;
> + case KVM_CAP_MAX_VCPUS:
> + r = -EINVAL;
> + if (cap->args[0] > KVM_MAX_VCPUS)
> + break;
> +
> + mutex_lock(&kvm->lock);
> + if (!kvm->created_vcpus) {
> + kvm->max_vcpus = cap->args[0];
> + r = 0;
> + }
> + mutex_unlock(&kvm->lock);
> + break;
> case KVM_CAP_MAX_VCPU_ID:
> r = -EINVAL;
> if (cap->args[0] > KVM_MAX_VCPU_IDS)
>
> 2. Modify the existing API such that max_vcpus will be set to
> KVM_MAX_VCPUS like in regular VMs and during KVM_TDX_INIT_VM, if the
> user created more vCPUs than the number specified, KVM_TDX_INIT_VM
> will fail.
>
> For option (1), there are some possible variations:
> 1.a. Do we keep the max_vcpus argument in KVM_TDX_INIT_VM? If so, we
> need to check if max_vcpus matches the number of max_vcpus already set
> and fail otherwise.
> 1.b. Do we require KVM_ENABLE_CAP_VM(KVM_CAP_MAX_VCPUS) to be called?
> Theoretically, we can set max_vcpus to the KVM default KVM_MAX_VCPUS
> and allow the user to change it as long as vcpus hasn't been created.
> If KVM_ENABLE_CAP_VM(KVM_CAP_MAX_VCPUS), the behavior will remain the
> same as regular VMs right now.
>
> In my opinion, the cleanest solution would be option 1 (new
> KVM_CAP_MAX_VCPUS API) while removing the max_vcpus argument from
> KVM_TDX_INIT_VM and setting the initial max_vcpus to KVM_MAX_VCPUS and
> not requiring the new ioctl to be called unless userspace wants to
> specifically limit the number of vcpus. In that case,
> KVM_CAP_MAX_VCPUS can be called at any time until vcpus are created.
Regarding to KVM_CAP_MAX_CPUS vs KVM_TDX_INIT_VM, KVM_CAP_MAX_CPUS is more
generic, KVM_CAP_MAX_CPUS would be better. This follows tsc frequency.
If option (1) is adapted, the logic should go to the common code, i.e. under
linux/virt/kvm/, because there is nothing specific to x86. I don't see any use
case other than TDX, though.
--
Isaku Yamahata <isaku.yamahata@...il.com>
Powered by blists - more mailing lists