Message-ID: <CAAhV-H5bs5E4pMbMaTX+AZ9UwmSB81q0ga+CjM+ivY4kWVp2eQ@mail.gmail.com>
Date: Tue, 3 Feb 2026 17:17:29 +0800
From: Huacai Chen <chenhuacai@...nel.org>
To: Bibo Mao <maobibo@...ngson.cn>
Cc: WANG Xuerui <kernel@...0n.name>, Tianrui Zhao <zhaotianrui@...ngson.cn>, 
	loongarch@...ts.linux.dev, linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [PATCH v3 4/4] LoongArch: KVM: Add FPU delay load support

On Tue, Feb 3, 2026 at 4:59 PM Bibo Mao <maobibo@...ngson.cn> wrote:
>
>
>
> > On 2026/2/3 4:50 PM, Huacai Chen wrote:
> > On Tue, Feb 3, 2026 at 3:51 PM Bibo Mao <maobibo@...ngson.cn> wrote:
> >>
> >>
> >>
> >> On 2026/2/3 3:34 PM, Huacai Chen wrote:
> >>> On Tue, Feb 3, 2026 at 2:48 PM Bibo Mao <maobibo@...ngson.cn> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 2026/2/3 12:15 PM, Huacai Chen wrote:
> >>>>> Hi, Bibo,
> >>>>>
> >>>>> On Tue, Feb 3, 2026 at 11:31 AM Bibo Mao <maobibo@...ngson.cn> wrote:
> >>>>>>
> >>>>>> The FPU is lazily enabled under the KVM hypervisor. After the FPU is
> >>>>>> enabled and its context loaded, the vCPU can be preempted and FPU
> >>>>>> ownership lost again, causing unnecessary FPU exceptions and load/store
> >>>>>> cycles. Here the FPU load is delayed until just before guest entry.
> >>>>> Referring to LSX/LASX as FPU is a little strange, but somewhat reasonable.
> >>>>> Referring to LBT as FPU is very strange. So I still prefer the V1 logic.
> >>>> Yeah, LBT can use another, separate bit distinct from the FPU one. It is
> >>>> actually common to use one bit plus an FPU type variant to represent the
> >>>> different FPU load requirements, such as
> >>>> TIF_FOREIGN_FPSTATE/TIF_NEED_FPU_LOAD on other architectures.
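[Editor's note: a minimal userspace C sketch of the "one request bit plus a type variant" pattern discussed above, matching the shape of this patch's KVM_REQ_FPU_LOAD handling. All names (vcpu_sketch, request_aux_load, etc.) are illustrative, not the actual kernel code.]

```c
#include <assert.h>

/* Sketch of "one request bit + a type variant": the exception handler
 * only records *what* to load; the actual (expensive) restore is
 * deferred to the last step before guest entry. Names are hypothetical. */

enum aux_unit { AUX_NONE = 0, AUX_FPU, AUX_LSX, AUX_LASX, AUX_LBT };

struct vcpu_sketch {
	unsigned long req_bits;  /* pending request bitmap */
	enum aux_unit aux_type;  /* which unit the pending request refers to */
	enum aux_unit loaded;    /* what has actually been restored to HW */
};

#define REQ_AUX_LOAD (1UL << 0)

/* Exception path: cheap, just record the request. */
static void request_aux_load(struct vcpu_sketch *v, enum aux_unit unit)
{
	v->aux_type = unit;
	v->req_bits |= REQ_AUX_LOAD;
}

/* Just before guest entry, with preemption already disabled:
 * consume the request and perform the restore exactly once. */
static void check_aux_load(struct vcpu_sketch *v)
{
	if (v->req_bits & REQ_AUX_LOAD) {
		v->req_bits &= ~REQ_AUX_LOAD;
		v->loaded = v->aux_type;  /* stands in for kvm_own_*() */
		v->aux_type = AUX_NONE;
	}
}
```

Because the restore is consumed only at guest entry, a preemption between the exception and the entry cannot waste a load: only the latest requested unit is restored, once.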
> >>>>
> >>>> I think it is better to put an int fpu_load_type in struct loongarch_fpu.
> >>>>
> >>>> And there will be another optimization: avoiding loading the FPU again if
> >>>> the FPU HW is already owned by the current thread/vCPU. That will also add
> >>>> an int last_cpu to struct loongarch_fpu.
> >>>>
> >>>> Regards
> >>>> Bibo Mao
> >>>>>
> >>>>> If you insist on this version, please rename KVM_REQ_FPU_LOAD to
> >>>>> KVM_REQ_AUX_LOAD and rename fpu_load_type to aux_type, which is
> >>>>> similar to aux_inuse.
> >>> Then why not consider this?
> >> This can work now. However, there are two different structures: struct
> >> loongarch_fpu and struct loongarch_lbt.
> > Yes, but two structures don't block us from using KVM_REQ_AUX_LOAD and
> > aux_type to abstract both FPU and LBT, which is similar to aux_inuse.
> >>
> >> 1. If the kernel wants to use late FPU load, a new element fpu_load_type
> >> can be added to struct loongarch_fpu for both user apps and KVM.
> Where is aux_type put for the kernel/KVM cases? In the thread structure for
> kernel late FPU load, and in vcpu.arch for KVM late FPU load?
aux_type is just fpu_load_type renamed, so wherever fpu_load_type would go,
that is where aux_type goes.

> >>
> >> 2. With further optimization, the FPU HW can be owned by a user
> >> app/kernel/KVM; another int last_cpu will be added to struct loongarch_fpu.
> > Both loongarch_fpu and loongarch_lbt are register copies, so adding
> > fpu_load_type/last_cpu to them is not a good idea.
> If a vCPU using the FPU is preempted by a kernel thread that does not use
> the FPU, the HW FPU state is still the same as the SW FPU state, so the HW
> FPU load can be skipped.
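[Editor's note: a minimal sketch of the skip-reload idea Bibo describes, using a per-CPU ownership record plus a last_cpu field. Names (fpu_ctx, hw_owner, fpu_mark_loaded, etc.) are hypothetical, not from the actual patch.]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: the restore can be skipped when this context was
 * the last HW FPU owner on this CPU and nothing else took the HW since. */

#define SKETCH_NR_CPUS 4

struct fpu_ctx {
	int last_cpu;   /* CPU on which this context last occupied the HW FPU */
};

/* Per-CPU record of which context currently owns the HW FPU. */
static struct fpu_ctx *hw_owner[SKETCH_NR_CPUS];

/* Returns 1 when a full HW reload is needed, 0 when it can be skipped. */
static int fpu_reload_needed(struct fpu_ctx *ctx, int cpu)
{
	return !(ctx->last_cpu == cpu && hw_owner[cpu] == ctx);
}

/* Called after actually loading the context into HW. */
static void fpu_mark_loaded(struct fpu_ctx *ctx, int cpu)
{
	ctx->last_cpu = cpu;
	hw_owner[cpu] = ctx;
}
```

A kernel thread that never touches the FPU leaves hw_owner unchanged, so when the preempted vCPU resumes on the same CPU, fpu_reload_needed() returns 0 and the restore is skipped.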
>
> BTW, did you ever investigate the FPU load/save process on other general
> architectures besides MIPS?
I did not investigate any of them, including MIPS. Other architectures may
give us some inspiration, but that doesn't mean we should copy them, whether
X86 or MIPS.

X86 introduced lazy FPU, and then others also adopted lazy FPU; but now that
X86 has switched to eager FPU, should the others also do the same?

On the other hand, when you used separate FPU/LSX/LASX handling, I only
commented on the trace functions. Then you changed to centralized
FPU/LSX/LASX/LBT. Then, when I suggested you improve the centralized
FPU/LSX/LASX/LBT handling, you changed back to separate FPU/LBT again.
Where does this end?



Huacai
>
> Regards
> Bibo Mao
> >
> >
> > Huacai
> >>
> >> Regards
> >> Bibo Mao
> >>
> >>>
> >>> Huacai
> >>>
> >>>>>
> >>>>> Huacai
> >>>>>
> >>>>>>
> >>>>>> Signed-off-by: Bibo Mao <maobibo@...ngson.cn>
> >>>>>> ---
> >>>>>>     arch/loongarch/include/asm/kvm_host.h |  2 ++
> >>>>>>     arch/loongarch/kvm/exit.c             | 21 ++++++++++-----
> >>>>>>     arch/loongarch/kvm/vcpu.c             | 37 ++++++++++++++++++---------
> >>>>>>     3 files changed, 41 insertions(+), 19 deletions(-)
> >>>>>>
> >>>>>> diff --git a/arch/loongarch/include/asm/kvm_host.h b/arch/loongarch/include/asm/kvm_host.h
> >>>>>> index e4fe5b8e8149..902ff7bc0e35 100644
> >>>>>> --- a/arch/loongarch/include/asm/kvm_host.h
> >>>>>> +++ b/arch/loongarch/include/asm/kvm_host.h
> >>>>>> @@ -37,6 +37,7 @@
> >>>>>>     #define KVM_REQ_TLB_FLUSH_GPA          KVM_ARCH_REQ(0)
> >>>>>>     #define KVM_REQ_STEAL_UPDATE           KVM_ARCH_REQ(1)
> >>>>>>     #define KVM_REQ_PMU                    KVM_ARCH_REQ(2)
> >>>>>> +#define KVM_REQ_FPU_LOAD               KVM_ARCH_REQ(3)
> >>>>>>
> >>>>>>     #define KVM_GUESTDBG_SW_BP_MASK                \
> >>>>>>            (KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP)
> >>>>>> @@ -234,6 +235,7 @@ struct kvm_vcpu_arch {
> >>>>>>            u64 vpid;
> >>>>>>            gpa_t flush_gpa;
> >>>>>>
> >>>>>> +       int fpu_load_type;
> >>>>>>            /* Frequency of stable timer in Hz */
> >>>>>>            u64 timer_mhz;
> >>>>>>            ktime_t expire;
> >>>>>> diff --git a/arch/loongarch/kvm/exit.c b/arch/loongarch/kvm/exit.c
> >>>>>> index 65ec10a7245a..62403c7c6f9a 100644
> >>>>>> --- a/arch/loongarch/kvm/exit.c
> >>>>>> +++ b/arch/loongarch/kvm/exit.c
> >>>>>> @@ -754,7 +754,8 @@ static int kvm_handle_fpu_disabled(struct kvm_vcpu *vcpu, int ecode)
> >>>>>>                    return RESUME_HOST;
> >>>>>>            }
> >>>>>>
> >>>>>> -       kvm_own_fpu(vcpu);
> >>>>>> +       vcpu->arch.fpu_load_type = KVM_LARCH_FPU;
> >>>>>> +       kvm_make_request(KVM_REQ_FPU_LOAD, vcpu);
> >>>>>>
> >>>>>>            return RESUME_GUEST;
> >>>>>>     }
> >>>>>> @@ -794,8 +795,10 @@ static int kvm_handle_lsx_disabled(struct kvm_vcpu *vcpu, int ecode)
> >>>>>>     {
> >>>>>>            if (!kvm_guest_has_lsx(&vcpu->arch))
> >>>>>>                    kvm_queue_exception(vcpu, EXCCODE_INE, 0);
> >>>>>> -       else
> >>>>>> -               kvm_own_lsx(vcpu);
> >>>>>> +       else {
> >>>>>> +               vcpu->arch.fpu_load_type = KVM_LARCH_LSX;
> >>>>>> +               kvm_make_request(KVM_REQ_FPU_LOAD, vcpu);
> >>>>>> +       }
> >>>>>>
> >>>>>>            return RESUME_GUEST;
> >>>>>>     }
> >>>>>> @@ -812,8 +815,10 @@ static int kvm_handle_lasx_disabled(struct kvm_vcpu *vcpu, int ecode)
> >>>>>>     {
> >>>>>>            if (!kvm_guest_has_lasx(&vcpu->arch))
> >>>>>>                    kvm_queue_exception(vcpu, EXCCODE_INE, 0);
> >>>>>> -       else
> >>>>>> -               kvm_own_lasx(vcpu);
> >>>>>> +       else {
> >>>>>> +               vcpu->arch.fpu_load_type = KVM_LARCH_LASX;
> >>>>>> +               kvm_make_request(KVM_REQ_FPU_LOAD, vcpu);
> >>>>>> +       }
> >>>>>>
> >>>>>>            return RESUME_GUEST;
> >>>>>>     }
> >>>>>> @@ -822,8 +827,10 @@ static int kvm_handle_lbt_disabled(struct kvm_vcpu *vcpu, int ecode)
> >>>>>>     {
> >>>>>>            if (!kvm_guest_has_lbt(&vcpu->arch))
> >>>>>>                    kvm_queue_exception(vcpu, EXCCODE_INE, 0);
> >>>>>> -       else
> >>>>>> -               kvm_own_lbt(vcpu);
> >>>>>> +       else {
> >>>>>> +               vcpu->arch.fpu_load_type = KVM_LARCH_LBT;
> >>>>>> +               kvm_make_request(KVM_REQ_FPU_LOAD, vcpu);
> >>>>>> +       }
> >>>>>>
> >>>>>>            return RESUME_GUEST;
> >>>>>>     }
> >>>>>> diff --git a/arch/loongarch/kvm/vcpu.c b/arch/loongarch/kvm/vcpu.c
> >>>>>> index 995461d724b5..d05fe6c8f456 100644
> >>>>>> --- a/arch/loongarch/kvm/vcpu.c
> >>>>>> +++ b/arch/loongarch/kvm/vcpu.c
> >>>>>> @@ -232,6 +232,31 @@ static void kvm_late_check_requests(struct kvm_vcpu *vcpu)
> >>>>>>                            kvm_flush_tlb_gpa(vcpu, vcpu->arch.flush_gpa);
> >>>>>>                            vcpu->arch.flush_gpa = INVALID_GPA;
> >>>>>>                    }
> >>>>>> +
> >>>>>> +       if (kvm_check_request(KVM_REQ_FPU_LOAD, vcpu)) {
> >>>>>> +               switch (vcpu->arch.fpu_load_type) {
> >>>>>> +               case KVM_LARCH_FPU:
> >>>>>> +                       kvm_own_fpu(vcpu);
> >>>>>> +                       break;
> >>>>>> +
> >>>>>> +               case KVM_LARCH_LSX:
> >>>>>> +                       kvm_own_lsx(vcpu);
> >>>>>> +                       break;
> >>>>>> +
> >>>>>> +               case KVM_LARCH_LASX:
> >>>>>> +                       kvm_own_lasx(vcpu);
> >>>>>> +                       break;
> >>>>>> +
> >>>>>> +               case KVM_LARCH_LBT:
> >>>>>> +                       kvm_own_lbt(vcpu);
> >>>>>> +                       break;
> >>>>>> +
> >>>>>> +               default:
> >>>>>> +                       break;
> >>>>>> +               }
> >>>>>> +
> >>>>>> +               vcpu->arch.fpu_load_type = 0;
> >>>>>> +       }
> >>>>>>     }
> >>>>>>
> >>>>>>     /*
> >>>>>> @@ -1286,13 +1311,11 @@ int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
> >>>>>>     #ifdef CONFIG_CPU_HAS_LBT
> >>>>>>     int kvm_own_lbt(struct kvm_vcpu *vcpu)
> >>>>>>     {
> >>>>>> -       preempt_disable();
> >>>>>>            if (!(vcpu->arch.aux_inuse & KVM_LARCH_LBT)) {
> >>>>>>                    set_csr_euen(CSR_EUEN_LBTEN);
> >>>>>>                    _restore_lbt(&vcpu->arch.lbt);
> >>>>>>                    vcpu->arch.aux_inuse |= KVM_LARCH_LBT;
> >>>>>>            }
> >>>>>> -       preempt_enable();
> >>>>>>
> >>>>>>            return 0;
> >>>>>>     }
> >>>>>> @@ -1335,8 +1358,6 @@ static inline void kvm_check_fcsr_alive(struct kvm_vcpu *vcpu) { }
> >>>>>>     /* Enable FPU and restore context */
> >>>>>>     void kvm_own_fpu(struct kvm_vcpu *vcpu)
> >>>>>>     {
> >>>>>> -       preempt_disable();
> >>>>>> -
> >>>>>>            /*
> >>>>>>             * Enable FPU for guest
> >>>>>>             * Set FR and FRE according to guest context
> >>>>>> @@ -1347,16 +1368,12 @@ void kvm_own_fpu(struct kvm_vcpu *vcpu)
> >>>>>>            kvm_restore_fpu(&vcpu->arch.fpu);
> >>>>>>            vcpu->arch.aux_inuse |= KVM_LARCH_FPU;
> >>>>>>            trace_kvm_aux(vcpu, KVM_TRACE_AUX_RESTORE, KVM_TRACE_AUX_FPU);
> >>>>>> -
> >>>>>> -       preempt_enable();
> >>>>>>     }
> >>>>>>
> >>>>>>     #ifdef CONFIG_CPU_HAS_LSX
> >>>>>>     /* Enable LSX and restore context */
> >>>>>>     int kvm_own_lsx(struct kvm_vcpu *vcpu)
> >>>>>>     {
> >>>>>> -       preempt_disable();
> >>>>>> -
> >>>>>>            /* Enable LSX for guest */
> >>>>>>            kvm_check_fcsr(vcpu, vcpu->arch.fpu.fcsr);
> >>>>>>            set_csr_euen(CSR_EUEN_LSXEN | CSR_EUEN_FPEN);
> >>>>>> @@ -1378,7 +1395,6 @@ int kvm_own_lsx(struct kvm_vcpu *vcpu)
> >>>>>>
> >>>>>>            trace_kvm_aux(vcpu, KVM_TRACE_AUX_RESTORE, KVM_TRACE_AUX_LSX);
> >>>>>>            vcpu->arch.aux_inuse |= KVM_LARCH_LSX | KVM_LARCH_FPU;
> >>>>>> -       preempt_enable();
> >>>>>>
> >>>>>>            return 0;
> >>>>>>     }
> >>>>>> @@ -1388,8 +1404,6 @@ int kvm_own_lsx(struct kvm_vcpu *vcpu)
> >>>>>>     /* Enable LASX and restore context */
> >>>>>>     int kvm_own_lasx(struct kvm_vcpu *vcpu)
> >>>>>>     {
> >>>>>> -       preempt_disable();
> >>>>>> -
> >>>>>>            kvm_check_fcsr(vcpu, vcpu->arch.fpu.fcsr);
> >>>>>>            set_csr_euen(CSR_EUEN_FPEN | CSR_EUEN_LSXEN | CSR_EUEN_LASXEN);
> >>>>>>            switch (vcpu->arch.aux_inuse & (KVM_LARCH_FPU | KVM_LARCH_LSX)) {
> >>>>>> @@ -1411,7 +1425,6 @@ int kvm_own_lasx(struct kvm_vcpu *vcpu)
> >>>>>>
> >>>>>>            trace_kvm_aux(vcpu, KVM_TRACE_AUX_RESTORE, KVM_TRACE_AUX_LASX);
> >>>>>>            vcpu->arch.aux_inuse |= KVM_LARCH_LASX | KVM_LARCH_LSX | KVM_LARCH_FPU;
> >>>>>> -       preempt_enable();
> >>>>>>
> >>>>>>            return 0;
> >>>>>>     }
> >>>>>> --
> >>>>>> 2.39.3
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
>
