[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <26ea7039-3186-c23f-daba-d039bb8d6f48@redhat.com>
Date: Fri, 10 Dec 2021 23:13:29 +0100
From: Paolo Bonzini <pbonzini@...hat.com>
To: Yang Zhong <yang.zhong@...el.com>, x86@...nel.org,
kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com
Cc: seanjc@...gle.com, jun.nakajima@...el.com, kevin.tian@...el.com,
jing2.liu@...ux.intel.com, jing2.liu@...el.com
Subject: Re: [PATCH 16/19] kvm: x86: Introduce KVM_{G|S}ET_XSAVE2 ioctl
On 12/10/21 17:30, Paolo Bonzini wrote:
>>
>> +static int kvm_vcpu_ioctl_x86_set_xsave2(struct kvm_vcpu *vcpu, u8
>> *state)
>> +{
>> + if (fpstate_is_confidential(&vcpu->arch.guest_fpu))
>> + return 0;
>> +
>> + return fpu_copy_uabi_to_guest_fpstate(&vcpu->arch.guest_fpu, state,
>> + supported_xcr0, &vcpu->arch.pkru);
>> +}
>> +
>
> I think fpu_copy_uabi_to_guest_fpstate (and therefore
> copy_uabi_from_kernel_to_xstate) needs to check that the size is
> compatible with the components in the input.
>
> Also, IIUC the size of the AMX state will vary in different processors.
> Is this correct? If so, this should be handled already by
> KVM_GET/SET_XSAVE2 and therefore should be part of the
> arch/x86/kernel/fpu APIs. In the future we want to support migrating a
> "small AMX" host to a "large AMX" host; and also migrating from a "large
> AMX" host to a "small AMX" host if the guest CPUID is compatible with
> the destination of the migration.
So, the size of the AMX state will depend on the active "palette" in
TILECONFIG, and on the CPUID information. I have a few questions on how
Intel intends to handle future extensions to AMX:
- can we assume that, in the future, palette 1 will always have the same
value (bytes_per_row=64, max_names=8, max_rows=16), and basically that
the only variable value is really the number of palettes?
- how does Intel plan to handle bigger TILEDATA? Will it use more XCR0
bits or will it rather enlarge save state 18?
If it will use more XCR0 bits, I suppose that XCR0 bits will control
which palettes can be chosen by LDTILECFG.
If not, on the other hand, this will be a first case of one system's
XSAVE data not being XRSTOR-able on another system even if the
destination system can set XCR0 to the same value as the source system.
Likewise, if the size and offsets for save state 18 were to vary
depending on the selected palette, then this would be novel, in that the
save state size and offsets would not be in CPUID anymore. It would be
particularly interesting for non-compacted format, where all save states
after 18 would also move forward.
So, I hope that save state 18 will be frozen to 8k. In that case, and
if palette 1 is frozen to the same values as today, implementing
migration will not be a problem; it will be essentially the same as
SSE->AVX (horizontal extension of existing registers) and/or AVX->AVX512
(both horizontal and vertical extension).
By the way, I think KVM_SET_XSAVE2 is not needed. Instead:
- KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2) should return the size of the
buffer that is passed to KVM_GET_XSAVE2
- KVM_GET_XSAVE2 should fill in the buffer expecting that its size is
whatever KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2) passes
- KVM_SET_XSAVE can just expect a buffer that is bigger than 4k if the
save states recorded in the header point to offsets larger than 4k.
Paolo
Powered by blists - more mailing lists