Message-ID: <6b1ec845-4e24-4aa9-b262-49b2fc57553f@linux.intel.com>
Date: Thu, 28 Aug 2025 18:11:44 +0800
From: Binbin Wu <binbin.wu@...ux.intel.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
"Hunter, Adrian" <adrian.hunter@...el.com>,
"seanjc@...gle.com" <seanjc@...gle.com>
Cc: "Gao, Chao" <chao.gao@...el.com>, "Huang, Kai" <kai.huang@...el.com>,
"Li, Xiaoyao" <xiaoyao.li@...el.com>,
"Chatre, Reinette" <reinette.chatre@...el.com>,
"kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>,
"tony.lindgren@...ux.intel.com" <tony.lindgren@...ux.intel.com>,
"pbonzini@...hat.com" <pbonzini@...hat.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"Yamahata, Isaku" <isaku.yamahata@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Zhao, Yan Y" <yan.y.zhao@...el.com>, "Weiny, Ira" <ira.weiny@...el.com>
Subject: Re: [PATCH RFC 1/2] KVM: TDX: Disable general support for MWAIT in
guest
On 8/19/2025 11:59 PM, Edgecombe, Rick P wrote:
> On Tue, 2025-08-19 at 13:40 +0800, Binbin Wu wrote:
>> Currently, KVM TDX code filters out TSX (HLE or RTM) and WAITPKG using
>> tdx_clear_unsupported_cpuid(), which is a sort of blacklist.
>>
>> I am wondering if we could add another array, e.g., tdx_cpu_caps[], which is the
>> TDX version of kvm_cpu_caps[].
>>
>> Using tdx_cpu_caps[] would be a whitelist approach.
> We had something like this in some of the earlier revisions of the TDX CPUID
> configuration.
>
>> For a new feature
>> - If the developer doesn't know anything about TDX, the bit is just added to
>> kvm_cpu_caps[].
>> - If the developer knows that the feature is supported by both non-TDX VMs and TDs
>> (either the feature doesn't require any additional virtualization support or
>> the virtualization support is added for TDX), extend the macros to set the bit
>> both in kvm_cpu_caps[] and tdx_cpu_caps[].
>> - If there is a feature not supported by non-TDX VMs, but supported by TDs,
>> extend the macros to set the bit only in tdx_cpu_caps[].
>> So, tdx_cpu_caps[] could be used as the filter of configurable bits reported
>> to userspace.
> In some ways this is the simplest, but having to maintain a big list in KVM was
> not ideal.
Agreed.
> The original solution started with KVM_GET_SUPPORTED_CPUID and then
> massaged the results to fit, so maybe just encoding the whole thing separately
> is enough to reconsider it.
>
> But what I was thinking is that we could move most of that hardcoded list into the
> TDX module, and only keep a list of non-trivial features (i.e. not simple
> instruction CPUID bits) in KVM. The list of simple features (definition TBD)
> could be provided by the TDX module.
It sounds like a good idea.
Either a list of simple features or the inverse (a list of non-trivial
features) would work. The TDX module already provides an interface to get
the directly configurable bits, so the VMM can derive the other part by
masking. But having the TDX module provide the list of non-trivial
features may be more direct.
I think non-trivial features should cover both cases:
- a feature that clobbers host state
- a feature that requires additional para-virtualization support in the
  VMM, e.g., the feature's MSR(s) must be virtualized by the VMM. Without
  proper para-virtualization support in the VMM, the guest will run into
  functional issues when using the feature.
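
As a rough sketch of the masking idea (not actual KVM code), assume that for
one CPUID output register the TDX module reports a mask of directly
configurable bits and a mask of non-trivial bits, and that KVM keeps its own
mask of the non-trivial bits it knows how to handle. The struct, the function
name and all three masks are hypothetical and used only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: feature masks for one 32-bit CPUID output register. */
struct cpuid_masks {
	uint32_t configurable;	/* directly configurable bits, per the TDX module */
	uint32_t nontrivial;	/* bits flagged as non-trivial (assumed interface) */
	uint32_t kvm_handled;	/* non-trivial bits KVM has support for */
};

/* Bits KVM could report to userspace as configurable for a TD. */
static uint32_t tdx_reportable_bits(const struct cpuid_masks *m)
{
	uint32_t simple, vetted;

	assert(m != NULL);

	/* Simple bits pass through without any KVM-side list. */
	simple = m->configurable & ~m->nontrivial;
	/* Non-trivial bits pass only if KVM explicitly handles them. */
	vetted = m->configurable & m->nontrivial & m->kvm_handled;

	return simple | vetted;
}
```

With this shape, KVM only has to maintain kvm_handled; a non-trivial bit that
KVM has never heard of is dropped by default.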
> So KVM could do the full filtering but only
> keep a list that today would just look like TSX and WAITPKG that we already
> have. So basically the same as what you are proposing, but just shrinks the size
> of list KVM has to keep.
>
>> Compared to a blacklist (i.e., tdx_clear_unsupported_cpuid()), there is no risk
>> that a feature not supported by TDX is forgotten to be added to the blacklist.
>> Also, tdx_cpu_caps[] could support a feature that is not supported for non-TDX VMs.
> We definitely can't have TDX module adding any host affecting features that we
> would automatically allow. And having a separate opt-in interface that doesn't
> "speak" cpuid bits is going to just complicate the already complicated logic
> that is in QEMU.
With the list of non-trivial features, the VMM can prevent userspace from
setting any bit in the list that the VMM does not support.
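
For illustration only, such a check on the CPUID values requested by userspace
could look like the sketch below. The policy struct, the function name and the
idea of walking an array of 32-bit CPUID register values are assumptions made
for this example, not an existing KVM or TDX module interface:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-register policy. */
struct nontrivial_policy {
	uint32_t nontrivial;	/* bits flagged as non-trivial */
	uint32_t kvm_handled;	/* non-trivial bits the VMM supports */
};

/*
 * Return the index of the first register in which userspace requested a
 * non-trivial bit that the VMM does not support, or -1 if the request is
 * acceptable. Simple bits are not checked here.
 */
static int tdx_find_rejected_word(const uint32_t *requested,
				  const struct nontrivial_policy *policy,
				  size_t nr_words)
{
	assert(requested != NULL && policy != NULL);

	for (size_t i = 0; i < nr_words; i++) {
		uint32_t bad = requested[i] & policy[i].nontrivial &
			       ~policy[i].kvm_handled;

		if (bad)
			return (int)i;
	}
	return -1;
}
```

A caller would fail the TD's CPUID configuration when the result is not -1.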
So could KVM enforce consistency only for the non-trivial feature bits?
After all, these are the bits that really matter from KVM's point of view.
If giving userspace, KVM and the TDX module a consistent view of CPUID for a
TD is still a goal, then whenever a new fixed1 bit is added in a new TDX
spec, an opt-in interface is still required so that userspace can get the
full picture. Also, userspace doesn't know which opt-in options are
available unless the TDX module provides another interface to report them...
yeah, very complicated :(
Ideally, if the TDX module never adds a new fixed1 bit (whether newly defined
or converted from another type) and never converts a fixed1 bit to a fixed0
bit, then userspace can calculate the right fixed1 bits from the base spec
and the directly configurable bits, without a separate opt-in interface.
>
>> Then we don't need a host opt-in for these directly configurable bits not
>> clobbering host states.
>>
>> Of course, to prevent userspace from setting a feature bit that would clobber host
>> state, but is not included in tdx_cpu_caps[], I think a new feature that would
>> clobber host state should require a host opt-in to the TDX module.
> Yes, but if we have some way to get the host clobbering type info programmatically
> we could keep the host opt-in as part of the main CPUID bit configuration. What
> I think will be bad is if we grow a separate protocol of opt-ins. KVM and QEMU
> manage everything with CPUID, so it will be easier if we stick to that.
>
Agree.