[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABgObfa5ViBjb_BnmKqf0+7M6rZ5-M+yOw_7tVK_Ek6tp21Z=w@mail.gmail.com>
Date: Tue, 30 Dec 2025 00:31:55 +0100
From: Paolo Bonzini <pbonzini@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org, x86@...nel.org,
stable@...r.kernel.org
Subject: Re: [PATCH 2/5] x86, fpu: separate fpstate->xfd and guest XFD
On Mon, Dec 29, 2025 at 11:45 PM Sean Christopherson <seanjc@...gle.com> wrote:
> The fix works only because the userspace XFD[18] must be '0' since the kernel
> never re-disables XFD features after they are enabled.
Yes, this is why I considered XFD[18]=1 to be a bug.
> Which is probably fine
> in practice since re-disabling a component for a guest task would need to force
> the guest FPU back into an init state as well, but I don't love the complexity.
>
> This also creates a nasty, subtle asymmetry in KVM's ABI.
I find this second argument more convincing; I preferred the
complexity of an extra field to track guest xfd, to having to patch
xstate_bv.
(Initially I was worried also about mismatches between xstate_bv and
xcomp_bv but those are find; When the compacted format is in use
xcomp_bv[62:0] is simply EDX:EAX & XCR0).
> Lastly, the fix is effectively papering over another bug, which I'm pretty sure
> is the underlying issue that was originally encountered.
Yes, I agree this is the most likely scenario. Whether it's papering
over it or otherwise, it depends on what you consider the invariants
to be.
> So, given that KVM's effective ABI is to record XSTATE_BV[i]=0 if XFD[i]==1, I
> vote to fix this by emulating that behavior when stuffing XFD in
> fpu_update_guest_xfd(), and then manually closing the hole Paolo found in
> fpu_copy_uabi_to_guest_fpstate().
I disagree with changing the argument from const void* to void*.
Let's instead treat it as a KVM backwards-compatibility quirk:
union fpregs_state *xstate =
(union fpregs_state *)guest_xsave->region;
xstate->xsave.header.xfeatures &=
~vcpu->arch.guest_fpu.fpstate->xfd;
It keeps the kernel/ API const as expected and if anything I'd
consider adding a WARN to fpu_copy_uabi_to_guest_fpstate(), basically
asserting that there would be no #NM on the subsequent restore.
> @@ -319,10 +319,25 @@ EXPORT_SYMBOL_FOR_KVM(fpu_enable_guest_xfd_features);
> #ifdef CONFIG_X86_64
> void fpu_update_guest_xfd(struct fpu_guest *guest_fpu, u64 xfd)
> {
> + struct fpstate *fpstate = guest_fpu->fpstate;
> +
> fpregs_lock();
> - guest_fpu->fpstate->xfd = xfd;
> - if (guest_fpu->fpstate->in_use)
> - xfd_update_state(guest_fpu->fpstate);
> + fpstate->xfd = xfd;
> + if (fpstate->in_use)
> + xfd_update_state(fpstate);
> +
> + /*
> + * If the guest's FPU state is NOT resident in hardware, clear disabled
> + * components in XSTATE_BV as attempting to load disabled components
> + * will generate #NM _in the host_, and KVM's ABI is that saving guest
> + * XSAVE state should see XSTATE_BV[i]=0 if XFD[i]=1.
> + *
> + * If the guest's FPU state is in hardware, simply do nothing as XSAVE
> + * itself saves XSTATE_BV[i] as 0 if XFD[i]=1.
s/saves/(from fpu_swap_kvm_fpstate) will save/
> + */
> + if (xfd && test_thread_flag(TIF_NEED_FPU_LOAD))
> + fpstate->regs.xsave.header.xfeatures &= ~xfd;
No objections to this part. I'll play with this to adjust the
selftests either tomorrow or, more likely, on January 2nd, and send a
v2 that also includes the change from preemption_disabled to
irqs_disabled.
I take it that you don't have any qualms with the new
fpu_load_guest_fpstate function, but let me know if you prefer to have
it in a separate submission destined to 6.20 only.
Paolo
Powered by blists - more mailing lists