[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BYAPR11MB3256B39E2A34A09FF64ECC5BA9B79@BYAPR11MB3256.namprd11.prod.outlook.com>
Date: Wed, 13 Oct 2021 06:15:56 +0000
From: "Liu, Jing2" <jing2.liu@...el.com>
To: Paolo Bonzini <pbonzini@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>
CC: "x86@...nel.org" <x86@...nel.org>,
"Bae, Chang Seok" <chang.seok.bae@...el.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"Arjan van de Ven" <arjan@...ux.intel.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"Nakajima, Jun" <jun.nakajima@...el.com>,
Jing Liu <jing2.liu@...ux.intel.com>,
"seanjc@...gle.com" <seanjc@...gle.com>
Subject: RE: [patch 13/31] x86/fpu: Move KVMs FPU swapping to FPU core
> On 12/10/21 02:00, Thomas Gleixner wrote:
> > Swapping the host/guest FPU is directly fiddling with FPU internals
> > which requires 5 exports. The upcoming support of dymanically enabled
> > states would even need more.
> >
> > Implement a swap function in the FPU core code and export that instead.
> >
> > Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
> > Cc: kvm@...r.kernel.org
> > Cc: Paolo Bonzini <pbonzini@...hat.com>
> > ---
> > arch/x86/include/asm/fpu/api.h | 8 +++++
> > arch/x86/include/asm/fpu/internal.h | 15 +---------
> > arch/x86/kernel/fpu/core.c | 30 ++++++++++++++++++---
> > arch/x86/kernel/fpu/init.c | 1
> > arch/x86/kernel/fpu/xstate.c | 1
> > arch/x86/kvm/x86.c | 51 +++++++-----------------------------
> > arch/x86/mm/extable.c | 2 -
> > 7 files changed, 48 insertions(+), 60 deletions(-)
> >
When looking into the tglx/devel.git x86/fpu for the full #1-#4
series and the KVM AMX support, I'd like to talk two things
as follows,
1. KVM dynamic allocation API:
Since KVM also uses dynamic allocation, after KVM detects guest
requesting AMX by #NM trap, KVM need alloc extra buffer for
this vcpu's current->thread.fpu.fpstate and guest_fpu related.
So far, the kernel itself has such API like fpstate_realloc(), but it's
static. How about making a common function usable for KVM?
2. There exists a case that *guest AMX state can be lost*:
After KVM passthrough XFD to guest, when vmexit opening
irq window and KVM is interrupted, kernel softirq path can call
kernel_fpu_begin() to touch xsave state. This function does
XSAVES. If guest XFD[18] is 1, and with guest AMX state in register,
then guest AMX state is lost by XSAVES.
The detailed example call trace in commit
commit 2620fe268e80d667a94553cd37a94ccaa2cb8c83
Author: Sean Christopherson <seanjc@...gle.com>
Date: Fri Jan 17 11:30:51 2020 -0800
KVM: x86: Revert "KVM: X86: Fix fpu state crash in kvm guest"
Reload the current thread's FPU state, which contains the guest's FPU
state, to the CPU registers if necessary during vcpu_enter_guest().
TIF_NEED_FPU_LOAD can be set any time control is transferred out of
KVM,
e.g. if I/O is triggered during a KVM call to get_user_pages() or if a
softirq occurs while KVM is scheduled in.
...
A sample trace triggered by warning if TIF_NEED_FPU_LOAD is set while
vcpu state is loaded:
<IRQ>
gcmaes_crypt_by_sg.constprop.12+0x26e/0x660
? 0xffffffffc024547d
? __qdisc_run+0x83/0x510
? __dev_queue_xmit+0x45e/0x990
...
? do_IRQ+0x7f/0xd0
? common_interrupt+0xf/0xf
</IRQ>
? irq_entries_start+0x20/0x660
? vmx_get_interrupt_shadow+0x2f0/0x710 [kvm_intel]
? kvm_set_msr_common+0xfc7/0x2380 [kvm]
? recalibrate_cpu_khz+0x10/0x10
? ktime_get+0x3a/0xa0
? kvm_arch_vcpu_ioctl_run+0x107/0x560 [kvm]
? kvm_init+0x6bf/0xd00 [kvm]
For this case, I think one way is kernel doing something before XSAVES
for KVM thread; another way is let KVM fix: maintaining a zero XFD
value (by current->state.fpu.fpstate->xfd = 0) after vcpu fpu state is
loaded and restore real guest XFD value before vmenter.
Logic as follows.
after vmexit:
if XFD is passthrough
then
sync guest XFD to vmx->xfd;
set XFD to current->state.fpu.fpstate->xfd (= 0)
__this_cpu_write(xfd_state, 0);
before vmenter (irq is disabled):
if passthrough
then
restore to real guest XFD by vmx->xfd;
vcpu_run: (if XFD is passthrough)
load: swap from qemu's to a zero XFD
put: swap zero to qemu's
Thanks,
Jing
[...]
Powered by blists - more mailing lists