Message-ID: <CANRm+CzLvyWswEX1UDnESSLHO5xt2wPciL6b=TTr-ua7yKZSTA@mail.gmail.com>
Date:   Wed, 13 Nov 2019 14:05:59 +0800
From:   Wanpeng Li <kernellwp@...il.com>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     LKML <linux-kernel@...r.kernel.org>, kvm <kvm@...r.kernel.org>,
        Radim Krčmář <rkrcmar@...hat.com>,
        Sean Christopherson <sean.j.christopherson@...el.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>
Subject: Re: [PATCH 1/2] KVM: X86: Single target IPI fastpath

On Tue, 12 Nov 2019 at 09:33, Wanpeng Li <kernellwp@...il.com> wrote:
>
> On Tue, 12 Nov 2019 at 05:59, Paolo Bonzini <pbonzini@...hat.com> wrote:
> >
> > On 09/11/19 08:05, Wanpeng Li wrote:
> > > From: Wanpeng Li <wanpengli@...cent.com>
> > >
> > > > This patch optimizes single-target IPIs in x2APIC physical destination mode
> > > > with fixed delivery mode: the IPI is delivered to the receiver immediately
> > > > after the sender's ICR write causes a vmexit, skipping various checks when
> > > > possible.
> > >
> > > Testing on Xeon Skylake server:
> > >
> > > > The virtual IPI latency, from the sender's send to the receiver's receive,
> > > > drops by more than 330 CPU cycles.
> > > >
> > > > Running hackbench (reschedule IPI) in the guest, the average handling time of
> > > > vmexits caused by MSR_WRITE drops by more than 1000 CPU cycles:
> > >
> > > Before patch:
> > >
> > >   VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time   Avg time
> > > MSR_WRITE    5417390    90.01%    16.31%      0.69us    159.60us    1.08us
> > >
> > > After patch:
> > >
> > >   VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time   Avg time
> > > MSR_WRITE    6726109    90.73%    62.18%      0.48us    191.27us    0.58us
> >
> > Do you have retpolines enabled?  The bulk of the speedup might come just
> > from the indirect jump.
>
> With 'mitigations=off' added to the host grub parameters:
>
> Before patch:
>
>     VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time   Avg time
> MSR_WRITE    2681713    92.98%    77.52%      0.38us     18.54us    0.73us ( +-   0.02% )
>
> After patch:
>
>     VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time   Avg time
> MSR_WRITE    2953447    92.48%    62.47%      0.30us     59.09us    0.40us ( +-   0.02% )
>
> Actually, this is not the first attempt to add a shortcut for
> performance-sensitive MSR writes; the other effort is for the tscdeadline
> timer, from Isaku Yamahata, https://patchwork.kernel.org/cover/10541035/ .
> In our product observations, ICR and TSCDEADLINE MSR writes cause most
> of the MSR write vmexits, and multicast IPIs are not as common as unicast
> IPIs such as RESCHEDULE_VECTOR and CALL_FUNCTION_SINGLE_VECTOR. As far as
> I know, something similar to this patch has already been deployed in
> some cloud companies' private KVM forks.
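
For reference, the average MSR_WRITE handling times quoted above can be
compared with a small calculation. This is only a sketch: the function
names are made up for illustration, and the thread does not state the
host clock frequency, so the cpu_ghz argument is an assumption the reader
has to supply.

```python
def avg_time_reduction(before_us, after_us):
    """Return (microseconds saved, percent saved) for an average exit time."""
    saved_us = before_us - after_us
    return saved_us, 100.0 * saved_us / before_us


def saved_cycles(saved_us, cpu_ghz):
    """Convert saved microseconds to CPU cycles at an ASSUMED clock rate."""
    # 1 us = 1000 ns, and a cpu_ghz GHz clock runs cpu_ghz cycles per ns.
    return saved_us * 1000.0 * cpu_ghz


# Averages taken from the tables in this thread.
with_mitigations = avg_time_reduction(1.08, 0.58)   # default mitigations
without_mitigations = avg_time_reduction(0.73, 0.40)  # mitigations=off
```

With the figures from the tables, the saving is about 0.50us (roughly 46%)
with default mitigations and about 0.33us (roughly 45%) with
mitigations=off. At a hypothetical 2.0 GHz clock, 0.50us corresponds to
about 1000 cycles.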

Hi Paolo,

Do you think I should continue with this?

    Wanpeng
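
For readers following along, the condition the patch description names
(x2APIC physical destination mode, fixed delivery mode, single target)
can be sketched as a check on the 64-bit ICR value the guest writes. This
is not the patch's code: the bit positions follow the x2APIC ICR layout
in the Intel SDM, all names below are hypothetical, and excluding the
broadcast destination ID is an assumption about what "single target"
means here.

```python
# Bit layout of the 64-bit x2APIC ICR value; names are illustrative only.
VECTOR_MASK = 0xFF                # bits 7:0, interrupt vector
DELIVERY_MODE_MASK = 0x7 << 8     # bits 10:8, 0 = fixed delivery
DEST_MODE_LOGICAL = 1 << 11       # bit 11, 0 = physical, 1 = logical
SHORTHAND_MASK = 0x3 << 18        # bits 19:18, 0 = no shorthand
X2APIC_BROADCAST = 0xFFFFFFFF     # destination ID that targets all CPUs


def is_single_target_fixed_ipi(icr):
    """True if an ICR write matches the assumed fastpath conditions."""
    dest = (icr >> 32) & 0xFFFFFFFF   # bits 63:32 hold the destination ID
    return (
        (icr & DELIVERY_MODE_MASK) == 0      # fixed delivery mode
        and (icr & DEST_MODE_LOGICAL) == 0   # physical destination mode
        and (icr & SHORTHAND_MASK) == 0      # no self/all shorthand
        and dest != X2APIC_BROADCAST         # assumed: broadcast is excluded
    )


def make_icr(dest, vector, flags=0):
    """Hypothetical helper that assembles an ICR value for experiments."""
    return (dest << 32) | flags | (vector & VECTOR_MASK)
```

For example, make_icr(3, 0xFD) passes the check, while the same value
with DEST_MODE_LOGICAL set, with a shorthand, or with the broadcast
destination does not; those cases would take the regular MSR write path.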
