Message-ID: <Zle29YsDN5Hff7Lo@google.com>
Date: Wed, 29 May 2024 16:15:01 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Kai Huang <kai.huang@...el.com>
Cc: Isaku Yamahata <isaku.yamahata@...el.com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"isaku.yamahata@...il.com" <isaku.yamahata@...il.com>, Paolo Bonzini <pbonzini@...hat.com>,
Erdem Aktas <erdemaktas@...gle.com>, Sagi Shahar <sagis@...gle.com>, Bo2 Chen <chen.bo@...el.com>,
Hang Yuan <hang.yuan@...el.com>, Tina Zhang <tina.zhang@...el.com>,
isaku.yamahata@...ux.intel.com
Subject: Re: [PATCH v19 037/130] KVM: TDX: Make KVM_CAP_MAX_VCPUS backend specific

On Tue, May 14, 2024, Kai Huang wrote:
>
>
> On 11/05/2024 2:04 am, Sean Christopherson wrote:
> > On Thu, May 09, 2024, Isaku Yamahata wrote:
> > > On Fri, May 10, 2024 at 11:19:44AM +1200, Kai Huang <kai.huang@...el.com> wrote:
> > > > On 10/05/2024 10:52 am, Sean Christopherson wrote:
> > > > > On Fri, May 10, 2024, Kai Huang wrote:
> > > > > > On 10/05/2024 4:35 am, Sean Christopherson wrote:
> > > > > > > KVM x86 limits KVM_MAX_VCPUS to 4096:
> > > > > > >
> > > > > > > config KVM_MAX_NR_VCPUS
> > > > > > > 	int "Maximum number of vCPUs per KVM guest"
> > > > > > > 	depends on KVM
> > > > > > > 	range 1024 4096
> > > > > > > 	default 4096 if MAXSMP
> > > > > > > 	default 1024
> > > > > > > 	help
> > > > > > >
> > > > > > > whereas the limitation from TDX is apparently simply due to TD_PARAMS taking
> > > > > > > a 16-bit unsigned value:
> > > > > > >
> > > > > > > #define TDX_MAX_VCPUS (~(u16)0)
> > > > > > >
> > > > > > > i.e. it will likely be _years_ before TDX's limitation matters, if it ever does.
> > > > > > > And _if_ it becomes a problem, we don't necessarily need to have a different
> > > > > > > _runtime_ limit for TDX, e.g. TDX support could be conditioned on KVM_MAX_NR_VCPUS
> > > > > > > being <= 64k.
> > > > > >
> > > > > > Actually, in later versions of the TDX module (starting from 1.5 AFAICT), the
> > > > > > module has a metadata field to report the maximum number of vCPUs that it can
> > > > > > support for all TDX guests.
> > > > >
> > > > > My quick glance at the 1.5 source shows that the limit is still effectively
> > > > > 0xffff, so again, who cares? Assert on 0xffff at compile time and on the
> > > > > reported max at runtime, and simply refuse to use a TDX module that has
> > > > > dropped the minimum below 0xffff.
> > > >
> > > > I need to double check why this metadata field was added. My concern is that
> > > > future module versions may just lower the value.
> > >
> > > TD partitioning would reduce it significantly.
> >
> > That's still not a reason to plumb in what is effectively dead code. Either
> > partitioning is opt-in, at which point I suspect KVM will need yet more uAPI to
> > express the limitations to userspace, or the TDX module is potentially breaking
> > existing use cases.
>
> The 'max_vcpus_per_td' global metadata field is static for the TDX module.
> If the module supports TD partitioning, it just reports some smaller value
> regardless of whether we opt in to TD partitioning or not.
>
> I think the point is that 'max_vcpus_per_td' is a TDX architectural thing, and
> the kernel should not make any assumptions about its value.

It's not an assumption, it's a requirement. And KVM already places requirements
on "hardware", e.g. kvm-intel.ko will refuse to load if the CPU doesn't support
a bare minimum VMX feature set. Refusing to enable TDX because max_vcpus_per_td
is unexpectedly low isn't fundamentally different than refusing to enable VMX
because IRQ window exiting is unsupported.
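
For illustration, a minimal sketch of that kind of baseline check, in the spirit
of what kvm-intel.ko does at hardware setup time (the helper name is made up for
this example, and the control-name spelling may differ across kernel versions;
this is not the actual vmx.c code):

  #include <linux/errno.h>
  #include <linux/init.h>
  #include <linux/types.h>
  #include <asm/vmx.h>

  /* Sketch: treat a missing baseline VMX control as fatal for module load. */
  static int __init vmx_check_required_features(u32 cpu_based_exec_ctrl)
  {
          /* IRQ window exiting is part of the bare minimum KVM insists on. */
          if (!(cpu_based_exec_ctrl & CPU_BASED_INTR_WINDOW_EXITING))
                  return -EIO;    /* kvm-intel.ko refuses to load */

          return 0;
  }
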
In the unlikely event there is a legitimate reason for max_vcpus_per_td being
less than KVM's minimum, then we can update KVM's minimum as needed. But AFAICT,
that's purely theoretical at this point, i.e. this is all much ado about nothing.
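
Concretely, the check being argued for could look something like the sketch
below. This is illustrative only: the helper and parameter names are made up,
CONFIG_KVM_MAX_NR_VCPUS is the Kconfig knob quoted above, and the exact
threshold (KVM's configured maximum vs. a hard 0xffff) is the very thing under
discussion:

  #include <linux/build_bug.h>
  #include <linux/errno.h>
  #include <linux/init.h>
  #include <linux/printk.h>
  #include <linux/types.h>

  #define TDX_MAX_VCPUS (~(u16)0)   /* TD_PARAMS takes a 16-bit value */

  static int __init tdx_check_max_vcpus(u16 max_vcpus_per_td)
  {
          /* Compile-time: the TD_PARAMS field is architecturally 16 bits. */
          BUILD_BUG_ON((u16)TDX_MAX_VCPUS != 0xffff);

          /*
           * Runtime: refuse to enable TDX if the module reports a maximum
           * below what KVM is configured for, rather than plumbing a
           * TDX-specific runtime limit through to userspace.
           */
          if (max_vcpus_per_td < CONFIG_KVM_MAX_NR_VCPUS) {
                  pr_err("TDX: max_vcpus_per_td (%u) below KVM's limit (%u)\n",
                         max_vcpus_per_td, CONFIG_KVM_MAX_NR_VCPUS);
                  return -EINVAL;
          }

          return 0;
  }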