[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <51475a06-7775-4bbb-b53c-615796df8417@intel.com>
Date: Tue, 14 Jan 2025 22:04:16 +0200
From: Adrian Hunter <adrian.hunter@...el.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Chao Gao <chao.gao@...el.com>, pbonzini@...hat.com, kvm@...r.kernel.org,
dave.hansen@...ux.intel.com, rick.p.edgecombe@...el.com,
kai.huang@...el.com, reinette.chatre@...el.com, xiaoyao.li@...el.com,
tony.lindgren@...ux.intel.com, binbin.wu@...ux.intel.com,
dmatlack@...gle.com, isaku.yamahata@...el.com, nik.borisov@...e.com,
linux-kernel@...r.kernel.org, x86@...nel.org, yan.y.zhao@...el.com,
weijiang.yang@...el.com
Subject: Re: [PATCH 4/7] KVM: TDX: restore host xsave state when exit from the
guest TD
On 10/01/25 19:30, Sean Christopherson wrote:
> On Fri, Jan 10, 2025, Adrian Hunter wrote:
>> On 9/01/25 21:11, Sean Christopherson wrote:
>>> On Fri, Jan 03, 2025, Adrian Hunter wrote:
>>>> +static u64 tdx_guest_cr4(struct kvm_vcpu *vcpu)
>>>> +{
>>>> + struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
>>>> + u64 cr4;
>>>> +
>>>> + rdmsrl(MSR_IA32_VMX_CR4_FIXED1, cr4);
>>>
>>> This won't be accurate long-term. E.g. run KVM on hardware with CR4 bits that
>>> neither KVM nor TDX know about, and vcpu->arch.cr4 will end up with bits set that
>>> KVM think are illegal, which will cause it's own problems.
>>
>> Currently validation of CR4 is only done when user space changes it,
>> which should not be allowed for TDX. For that it looks like TDX
>> would need:
>>
>> kvm->arch.has_protected_state = true;
>>
>> Not sure why it doesn't already?
>
> Sorry, I didn't follow any of that.
>
>>> For CR0 and CR4, we should be able to start with KVM's set of allowed bits, not
>>> the CPU's. That will mean there will likely be missing bits, in vcpu->arch.cr{0,4},
>>> but if KVM doesn't know about a bit, the fact that it's missing should be a complete
>>> non-issue.
>>
>> What about adding:
>>
>> cr4 &= ~cr4_reserved_bits;
>>
>> and
>>
>> cr0 &= ~CR0_RESERVED_BITS
>
> I was thinking a much more explicit:
>
> vcpu->arch.cr4 = ~vcpu->arch.cr4_guest_rsvd_bits;
>
> which if it's done in tdx_vcpu_init(), in conjunction with freezing the vCPU
> model (see below), should be solid.
>
>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>> index d2ea7db896ba..f2b1980f830d 100644
>>>> --- a/arch/x86/kvm/x86.c
>>>> +++ b/arch/x86/kvm/x86.c
>>>> @@ -1240,6 +1240,11 @@ static int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
>>>> u64 old_xcr0 = vcpu->arch.xcr0;
>>>> u64 valid_bits;
>>>>
>>>> + if (vcpu->arch.guest_state_protected) {
>>>
>>> This should be a WARN_ON_ONCE() + return 1, no?
>>
>> With kvm->arch.has_protected_state = true, KVM_SET_XCRS
>> would fail, which would probably be fine except for KVM selftests:
>>
>> Currently the KVM selftests expect to be able to set XCR0:
>>
>> td_vcpu_add()
>> vm_vcpu_add()
>> vm_arch_vcpu_add()
>> vcpu_init_xcrs()
>> vcpu_xcrs_set()
>> vcpu_ioctl(KVM_SET_XCRS)
>> __TEST_ASSERT_VM_VCPU_IOCTL(!ret)
>>
>> Seems like vm->arch.has_protected_state is needed for KVM selftests?
>
> I doubt it's truly needed, my guess (without looking at the code) is that selftests
> are fudging around the fact that KVM doesn't stuff arch.xcr0.
Here is when it was added:
commit 8b14c4d85d031f7700fa4e042aebf99d933971f0
Author: Sean Christopherson <seanjc@...gle.com>
Date: Thu Oct 3 16:43:31 2024 -0700
KVM: selftests: Configure XCR0 to max supported value by default
To play nice with compilers generating AVX instructions, set CR4.OSXSAVE
and configure XCR0 by default when creating selftests vCPUs. Some distros
have switched gcc to '-march=x86-64-v3' by default, and while it's hard to
find a CPU which doesn't support AVX today, many KVM selftests fail with
Is below OK to avoid it?
diff --git a/tools/testing/selftests/kvm/include/x86_64/kvm_util_arch.h b/tools/testing/selftests/kvm/include/x86_64/kvm_util_arch.h
index 972bb1c4ab4c..42925152ed25 100644
--- a/tools/testing/selftests/kvm/include/x86_64/kvm_util_arch.h
+++ b/tools/testing/selftests/kvm/include/x86_64/kvm_util_arch.h
@@ -12,20 +12,21 @@ extern bool is_forced_emulation_enabled;
struct kvm_vm_arch {
vm_vaddr_t gdt;
vm_vaddr_t tss;
vm_vaddr_t idt;
uint64_t c_bit;
uint64_t s_bit;
int sev_fd;
bool is_pt_protected;
+ bool has_protected_sregs;
};
static inline bool __vm_arch_has_protected_memory(struct kvm_vm_arch *arch)
{
return arch->c_bit || arch->s_bit;
}
#define vm_arch_has_protected_memory(vm) \
__vm_arch_has_protected_memory(&(vm)->arch)
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index 0ed0768b1c88..89b70fe037d1 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -704,22 +704,25 @@ struct kvm_vcpu *vm_arch_vcpu_add(struct kvm_vm *vm, uint32_t vcpu_id)
*
* If this code is ever used to launch a vCPU with 32-bit entry point it
* may need to subtract 4 bytes instead of 8 bytes.
*/
TEST_ASSERT(IS_ALIGNED(stack_vaddr, PAGE_SIZE),
"__vm_vaddr_alloc() did not provide a page-aligned address");
stack_vaddr -= 8;
vcpu = __vm_vcpu_add(vm, vcpu_id);
vcpu_init_cpuid(vcpu, kvm_get_supported_cpuid());
- vcpu_init_sregs(vm, vcpu);
- vcpu_init_xcrs(vm, vcpu);
+
+ if (!vm->arch.has_protected_sregs) {
+ vcpu_init_sregs(vm, vcpu);
+ vcpu_init_xcrs(vm, vcpu);
+ }
vcpu->initial_stack_addr = stack_vaddr;
/* Setup guest general purpose registers */
vcpu_regs_get(vcpu, ®s);
regs.rflags = regs.rflags | 0x2;
regs.rsp = stack_vaddr;
vcpu_regs_set(vcpu, ®s);
/* Setup the MP state */
diff --git a/tools/testing/selftests/kvm/lib/x86_64/tdx/tdx_util.c b/tools/testing/selftests/kvm/lib/x86_64/tdx/tdx_util.c
index 16db4e97673e..da4bcfefdd70 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/tdx/tdx_util.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/tdx/tdx_util.c
@@ -477,25 +477,22 @@ static void load_td_boot_parameters(struct td_boot_parameters *params,
* entering the TD first time.
*
* Input Args:
* vm - Virtual Machine
* vcpuid - The id of the VCPU to add to the VM.
*/
struct kvm_vcpu *td_vcpu_add(struct kvm_vm *vm, uint32_t vcpu_id, void *guest_code)
{
struct kvm_vcpu *vcpu;
- /*
- * TD setup will not use the value of rip set in vm_vcpu_add anyway, so
- * NULL can be used for guest_code.
- */
- vcpu = vm_vcpu_add(vm, vcpu_id, NULL);
+ vm->arch.has_protected_sregs = true;
+ vcpu = vm_arch_vcpu_add(vm, vcpu_id);
tdx_td_vcpu_init(vcpu);
load_td_boot_parameters(addr_gpa2hva(vm, TD_BOOT_PARAMETERS_GPA),
vcpu, guest_code);
return vcpu;
}
/**
Powered by blists - more mailing lists