linux-kernel - Re: [PATCH 3/3] KVM: LAPIC: Optimize PMI delivering overhead

Message-ID: <CANRm+Cy=bb_iap6JKsux7ekmo6Td0FXqwpuVdgPSC8u8b2wFNA@mail.gmail.com>
Date:   Fri, 8 Oct 2021 19:06:34 +0800
From:   Wanpeng Li <kernellwp@...il.com>
To:     Vitaly Kuznetsov <vkuznets@...hat.com>
Cc:     LKML <linux-kernel@...r.kernel.org>, kvm <kvm@...r.kernel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>
Subject: Re: [PATCH 3/3] KVM: LAPIC: Optimize PMI delivering overhead

On Fri, 8 Oct 2021 at 18:52, Vitaly Kuznetsov <vkuznets@...hat.com> wrote:
>
> Wanpeng Li <kernellwp@...il.com> writes:
>
> > From: Wanpeng Li <wanpengli@...cent.com>
> >
> > The overhead of kvm_vcpu_kick() is huge because of the expensive RCU,
> > memory barrier, and similar operations in rcuwait_wake_up(). It is worse
> > for local delivery, since the vCPU is already scheduled and we still pay
> > this cost. With ftrace we can observe 12us+ for kvm_vcpu_kick() in the
> > kvm_pmu_deliver_pmi() path before the patch and 6us+ after the optimization.
> >
> > Signed-off-by: Wanpeng Li <wanpengli@...cent.com>
> > ---
> >  arch/x86/kvm/lapic.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index 76fb00921203..ec6997187c6d 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -1120,7 +1120,8 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
> >       case APIC_DM_NMI:
> >               result = 1;
> >               kvm_inject_nmi(vcpu);
> > -             kvm_vcpu_kick(vcpu);
> > +             if (vcpu != kvm_get_running_vcpu())
> > +                     kvm_vcpu_kick(vcpu);
>
> Out of curiosity,
>
> can this be converted into a generic optimization for kvm_vcpu_kick()
> instead? I.e. if kvm_vcpu_kick() is called for the currently running
> vCPU, there's almost nothing to do, especially when we already have a
> request pending, right? (I didn't put too much thought into it)

I thought about it before; I will do it in the next version since you
also vote for it. :)

    Wanpeng
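
For readers following the thread, the two placements of the check discussed above can be sketched in userspace C. This is only an illustration, not the kernel change: struct mock_vcpu, running_vcpu and every mock_* function are hypothetical stand-ins for struct kvm_vcpu, kvm_get_running_vcpu(), kvm_inject_nmi() and kvm_vcpu_kick(), and the "kick" here merely counts calls instead of doing any real wake-up work.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm_vcpu; holds only what the sketch needs. */
struct mock_vcpu {
	int id;
	int nmi_pending;	/* incremented by mock_inject_nmi() */
	int kicks;		/* how many times the (modelled) costly kick ran */
};

/* Stand-in for the per-CPU "currently running vCPU" pointer; NULL if none. */
static struct mock_vcpu *running_vcpu;

/* Stand-in for kvm_get_running_vcpu(). */
static struct mock_vcpu *mock_get_running_vcpu(void)
{
	return running_vcpu;
}

/* Stand-in for kvm_inject_nmi(): just records a pending NMI. */
static void mock_inject_nmi(struct mock_vcpu *vcpu)
{
	vcpu->nmi_pending++;
}

/* Stand-in for kvm_vcpu_kick(): counts calls to model the expensive path. */
static void mock_vcpu_kick(struct mock_vcpu *vcpu)
{
	vcpu->kicks++;
}

/* Variant 1: the check sits at the call site, as in the quoted diff. */
static void deliver_nmi_callsite_check(struct mock_vcpu *vcpu)
{
	mock_inject_nmi(vcpu);
	if (vcpu != mock_get_running_vcpu())
		mock_vcpu_kick(vcpu);
}

/*
 * Variant 2: the check moves into the kick helper, so every caller gets it.
 * Returns true if a kick was actually issued. This is an assumption about
 * what a generic version might look like, not the follow-up patch itself.
 */
static bool mock_vcpu_kick_generic(struct mock_vcpu *vcpu)
{
	if (vcpu == mock_get_running_vcpu())
		return false;
	mock_vcpu_kick(vcpu);
	return true;
}
```

With running_vcpu pointing at the target, both variants still record the NMI but skip the kick; for any other vCPU the kick is issued as before. Whether skipping is safe for every real caller of kvm_vcpu_kick() is exactly the question the review raises, and the sketch does not answer it.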