linux-kernel - Re: [patch 13/31] x86/fpu: Move KVMs FPU swapping to FPU core


Message-ID: <da47ba42-b61e-d236-2c1c-9c5504e48091@redhat.com>
Date:   Wed, 13 Oct 2021 10:42:53 +0200
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     "Liu, Jing2" <jing2.liu@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>
Cc:     "x86@...nel.org" <x86@...nel.org>,
        "Bae, Chang Seok" <chang.seok.bae@...el.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Arjan van de Ven <arjan@...ux.intel.com>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "Nakajima, Jun" <jun.nakajima@...el.com>,
        Jing Liu <jing2.liu@...ux.intel.com>,
        "seanjc@...gle.com" <seanjc@...gle.com>
Subject: Re: [patch 13/31] x86/fpu: Move KVMs FPU swapping to FPU core

On 13/10/21 09:46, Liu, Jing2 wrote:
> 
>> On 13/10/21 08:15, Liu, Jing2 wrote:
>>> After KVM passes XFD through to the guest, when a vmexit opens an irq
>>> window and KVM is interrupted, the kernel softirq path can call
>>> kernel_fpu_begin() to touch xsave state. This function does XSAVES. If
>>> guest XFD[18] is 1 and guest AMX state is live in registers, then the
>>> guest AMX state is lost by XSAVES.
>>
>> Yes, the host value of XFD (which is zero) has to be restored after vmexit.
>> See how KVM already handles SPEC_CTRL.
> 
> I'm trying to understand why qemu's XFD is zero after kernel supports AMX.

There are three copies of XFD:

- the guest value stored in vcpu->arch.

- the "QEMU" value attached to host_fpu.  This one only becomes zero if 
QEMU requires AMX (which shouldn't happen).

- the internal KVM value attached to guest_fpu.  When #NM happens, this 
one becomes zero.


The CPU value is:

- the host_fpu value before kvm_load_guest_fpu and after 
kvm_put_guest_fpu.  This ensures that QEMU context switch is as cheap as 
possible.

- the guest_fpu value between kvm_load_guest_fpu and kvm_put_guest_fpu.
This ensures that no state is lost in the case you are describing.

- the OR of the guest value and the guest_fpu value while the guest runs 
(using either MSR load/save lists, or manual wrmsr like 
pt_guest_enter/pt_guest_exit).  This ensures that the host has the 
opportunity to get a #NM exception, and allocate AMX state in the 
guest_fpu and in current->thread.fpu.
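
To make the three cases concrete, here is a minimal userspace sketch of which XFD value would be live on the CPU in each phase. It is a model only, not kernel code: struct xfd_copies, enum xfd_phase, xfd_on_cpu() and the XFD_AMX_TILEDATA macro are hypothetical names made up for illustration. Only the bit position (XFD[18]) and the three rules come from the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XFD[18] gates AMX tile data (bit position as quoted in this thread). */
#define XFD_AMX_TILEDATA (1ULL << 18)

/* Hypothetical model of the three software copies of XFD. */
struct xfd_copies {
	uint64_t guest;     /* guest-visible value, kept in vcpu->arch */
	uint64_t host_fpu;  /* "QEMU" value attached to host_fpu */
	uint64_t guest_fpu; /* internal KVM value attached to guest_fpu */
};

/* Hypothetical phases, named after the functions that delimit them. */
enum xfd_phase {
	XFD_PHASE_HOST,      /* before kvm_load_guest_fpu, after kvm_put_guest_fpu */
	XFD_PHASE_GUEST_FPU, /* between the two calls, but not inside the guest */
	XFD_PHASE_IN_GUEST   /* while the guest is actually running */
};

/* Return the value the XFD MSR is expected to hold in the given phase. */
static uint64_t xfd_on_cpu(const struct xfd_copies *c, enum xfd_phase phase)
{
	assert(c != NULL);

	switch (phase) {
	case XFD_PHASE_HOST:
		return c->host_fpu;
	case XFD_PHASE_GUEST_FPU:
		return c->guest_fpu;
	case XFD_PHASE_IN_GUEST:
		/* OR: a bit is armed if either the guest or KVM wants #NM. */
		return c->guest | c->guest_fpu;
	}
	return 0;
}
```

In this model, as long as guest_fpu still has bit 18 set, the bit stays set while the guest runs even if the guest has cleared its own copy, which is what gives the host its chance to see the #NM.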

> Yes, passthrough is done in two cases: one is when a guest #NM is trapped;
> the other is when the guest clears XFD before it generates #NM (which is
> possible for the guest), followed by passthrough.
> In both cases, we pass XFD through and allocate buffers for guest_fpu and
> current->thread.fpu.

I think it's simpler to always wait for #NM; it will only happen once 
per vCPU.  In other words, even if the guest clears XFD before it 
generates #NM, the guest_fpu's XFD remains nonzero and an #NM vmexit is 
still possible.  After #NM the guest_fpu's XFD is zero; then passthrough 
can happen and the #NM vmexit trap can be disabled.
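
The "always wait for #NM" flow can be sketched as a small state machine. Again this is a userspace model under stated assumptions, not the KVM implementation: struct vcpu_xfd_model and the helpers vcpu_xfd_init(), guest_write_xfd() and guest_touch_amx() are hypothetical, and the model ignores reinjecting #NM into a guest that left its own XFD bit set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define XFD_AMX_TILEDATA (1ULL << 18)

/* Hypothetical per-vCPU model; field names are illustrative only. */
struct vcpu_xfd_model {
	uint64_t guest_xfd;     /* value the guest wrote */
	uint64_t guest_fpu_xfd; /* KVM-internal value, nonzero until #NM */
	bool nm_intercept;      /* #NM vmexit trap enabled */
	bool amx_allocated;     /* AMX buffers allocated on the host side */
};

static void vcpu_xfd_init(struct vcpu_xfd_model *v)
{
	assert(v != NULL);
	v->guest_xfd = 0;
	v->guest_fpu_xfd = XFD_AMX_TILEDATA; /* armed until the first #NM */
	v->nm_intercept = true;
	v->amx_allocated = false;
}

/* A guest write only changes the guest copy, never guest_fpu_xfd. */
static void guest_write_xfd(struct vcpu_xfd_model *v, uint64_t val)
{
	assert(v != NULL);
	v->guest_xfd = val;
}

/* Model an AMX access in the guest; returns true on a #NM vmexit. */
static bool guest_touch_amx(struct vcpu_xfd_model *v)
{
	assert(v != NULL);

	/* While the guest runs, the CPU holds the OR of both values. */
	uint64_t cpu_xfd = v->guest_xfd | v->guest_fpu_xfd;

	if (!(cpu_xfd & XFD_AMX_TILEDATA))
		return false; /* no fault: AMX state is usable */
	if (!v->nm_intercept)
		return false; /* any #NM now goes straight to the guest */

	/* One-time #NM vmexit: allocate, then allow passthrough. */
	v->amx_allocated = true;
	v->guest_fpu_xfd &= ~XFD_AMX_TILEDATA;
	v->nm_intercept = false;
	return true;
}
```

Calling guest_write_xfd(&v, 0) right after vcpu_xfd_init(&v) and then guest_touch_amx(&v) twice models the case discussed here: the first access still exits once, the second does not.
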

Paolo