Message-ID: <9d336622-6964-454a-605f-1ca90b902836@redhat.com>
Date: Tue, 7 Jun 2022 14:54:14 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: Peter Xu <peterx@...hat.com>
Cc: Sasha Levin <sashal@...nel.org>, linux-kernel@...r.kernel.org,
stable@...r.kernel.org, Leonardo Bras <leobras@...hat.com>,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, x86@...nel.org,
chang.seok.bae@...el.com, luto@...nel.org, kvm@...r.kernel.org,
Sean Christopherson <seanjc@...gle.com>
Subject: Re: [PATCH AUTOSEL 5.16 07/28] x86/kvm/fpu: Limit guest
user_xfeatures to supported bits of XCR0
On 6/6/22 23:27, Peter Xu wrote:
> On Mon, Jun 06, 2022 at 06:18:12PM +0200, Paolo Bonzini wrote:
>>> However there seems to be something missing at least to me, on why it'll
>>> fail a migration from 5.15 (without this patch) to 5.18 (with this patch).
>>> In my test case, user_xfeatures will be 0x7 (FP|SSE|YMM) if without this
>>> patch, but 0x0 if with it.
>>
>> What CPU model are you using for the VM?
>
> I didn't specify it, assuming it's qemu64 with no extra parameters.
Ok, so indeed it lacks AVX and this patch can have an effect.
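For reference, the two user_xfeatures values mentioned above can be decoded with a minimal sketch like the one below. The bit positions are the architectural XSAVE component bits, but limit_user_xfeatures() is a hypothetical helper and not the kernel's code, and the empty guest mask is an assumption chosen only to match the reported 0x0; the real mask depends on the CPU model.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural XSAVE state-component bits (XCR0 bit positions). */
#define XFEATURE_MASK_FP  (UINT64_C(1) << 0) /* x87 state */
#define XFEATURE_MASK_SSE (UINT64_C(1) << 1) /* XMM registers */
#define XFEATURE_MASK_YMM (UINT64_C(1) << 2) /* AVX: upper halves of YMM */

/* Hypothetical helper: keep only the bits the guest CPU model supports. */
static inline uint64_t limit_user_xfeatures(uint64_t host_xfeatures,
                                            uint64_t guest_supported_xcr0)
{
    return host_xfeatures & guest_supported_xcr0;
}

/* Walk through the two values mentioned in this thread. */
static inline void xfeatures_example(void)
{
    uint64_t without_patch = XFEATURE_MASK_FP | XFEATURE_MASK_SSE |
                             XFEATURE_MASK_YMM;

    /* FP|SSE|YMM is the 0x7 seen without the patch. */
    assert(without_patch == 0x7);
    /* Assumed guest mask of 0: nothing survives the limit. */
    assert(limit_user_xfeatures(without_patch, 0) == 0x0);
}
```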
>> For example, if the source lacks this patch but the destination has it,
>> the source will transmit YMM registers, but the destination will fail to
>> set them if they are not available for the selected CPU model.
>>
>> See the commit message: "As a bonus, it will also fail if userspace tries to
>> set fpu features (with the KVM_SET_XSAVE ioctl) that are not compatible to
>> the guest configuration. Such features will never be returned by
>> KVM_GET_XSAVE or KVM_GET_XSAVE2."
>
> IIUC you meant we should have failed KVM_SET_XSAVE when they're not aligned
> (probably by failing validate_user_xstate_header when checking against the
> user_xfeatures on dest host). But that's probably not my case, because here
> KVM_SET_XSAVE succeeded, it's just that the guest gets a double fault after
> the precopy migration completes (or for postcopy when the switchover is
> done).
It is difficult to say what is happening without seeing at least the guest
code around the double fault, and possibly which exception was the first one
that contributed to it. (Above you said "fail a migration", and I thought that
was a different scenario from the double fault.)
Paolo