[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <df05e4fe-a50b-49a8-9ea0-2077cb061c44@intel.com>
Date: Thu, 12 Sep 2024 22:45:39 +0800
From: Xiaoyao Li <xiaoyao.li@...el.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: Rick Edgecombe <rick.p.edgecombe@...el.com>, seanjc@...gle.com,
kvm@...r.kernel.org, kai.huang@...el.com, isaku.yamahata@...il.com,
tony.lindgren@...ux.intel.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 25/25] KVM: x86: Add CPUID bits missing from
KVM_GET_SUPPORTED_CPUID
On 9/12/2024 10:09 PM, Paolo Bonzini wrote:
> On Thu, Sep 12, 2024 at 9:48 AM Xiaoyao Li <xiaoyao.li@...el.com> wrote:
>> On 9/11/2024 1:52 AM, Paolo Bonzini wrote:
>>> On 8/13/24 00:48, Rick Edgecombe wrote:
>>>> For KVM_TDX_CAPABILITIES, there was also the problem of bits that are
>>>> actually supported by KVM, but missing from get_supported_cpuid() for one
>>>> reason or another. These include X86_FEATURE_MWAIT, X86_FEATURE_HT and
>>>> X86_FEATURE_TSC_DEADLINE_TIMER. This is currently worked around in
>>>> QEMU by
>>>> adjusting which features are expected.
>>
>> I'm not sure what issue/problem can be worked around in QEMU.
>> QEMU doesn't expect these bit are reported by KVM as supported for TDX.
>> QEMU just accepts the result reported by KVM.
>
> QEMU already adds some extra bits, for example:
>
> ret |= CPUID_EXT_HYPERVISOR;
> if (kvm_irqchip_in_kernel() &&
> kvm_check_extension(s, KVM_CAP_TSC_DEADLINE_TIMER)) {
> ret |= CPUID_EXT_TSC_DEADLINE_TIMER;
> }
>
>> The problem is, TDX module and the hardware allow these bits be
>> configured for TD guest, but KVM doesn't allow. It leads to users cannot
>> create a TD with these bits on.
>
> KVM is not going to have any checks, it's only going to pass the
> CPUID to the TDX module and return an error if the check fails
> in the TDX module.
If so, new feature can be enabled for TDs out of KVM's control.
Is it acceptable?
> KVM can have a TDX-specific version of KVM_GET_SUPPORTED_CPUID, so
> that we can keep a variant of the "get supported bits and pass them
> to KVM_SET_CPUID2" logic, but that's it.
>
>>> This is the kind of API that we need to present for TDX, even if the
>>> details on how to get the supported CPUID are different. Not because
>>> it's a great API, but rather because it's a known API.
>>
>> However there are differences for TDX. For legacy VMs, the result of
>> KVM_GET_SUPPORTED_CPUID isn't used to filter the input of KVM_SET_CPUID2.
>> But for TDX, it needs to filter the input of KVM_TDX_VM_INIT.CPUID[]
>> because TDX module only allows the bits that are reported as
>> configurable to be set to 1.
>
> Yes, that's userspace's responsibility.
>
>> With current designed API, QEMU can only know which bits are
>> configurable before KVM_TDX_VM_INIT, i.e., which bits can be set to 1 or
>> 0 freely.
>
> The API needs userspace to have full knowledge of the
> requirements of the TDX module, if it wants to change the
> defaults provided by KVM.
>
> This is the same as for non-TDX VMs (including SNP). The only
> difference is that TDX and SNP fails, while non-confidential VMs
> get slightly garbage CPUID.
>
>> For other bits not reported as configurable, QEMU can know the exact
>> value of them via KVM_TDX_GET_CPUID, after KVM_TDX_VM_INIT and before
>> TD's running. With it, QEMU can validate the return value is matched
>> with what QEMU wants to set that determined by users input. If not
>> matched, QEMU can provide some warnings like what for legacy VMs:
>>
>> - TDX doesn't support requested feature: CPUID.01H.ECX.tsc-deadline
>> [bit 24]
>> - TDX forcibly sets features: CPUID.01H:ECX.hypervisor [bit 31]
>>
>> If there are ioctls to report the fixed0 bits and fixed1 bits for TDX,
>> QEMU can validate the user's configuration earlier.
>
> Yes, that's fine.
>
> Paolo
>
Powered by blists - more mailing lists