Message-ID: <CANRm+CzK_h2E9XWFipkNpAALLCBcM2vrUkdBpumwmT9AP09hfA@mail.gmail.com>
Date: Tue, 12 Nov 2019 09:33:49 +0800
From: Wanpeng Li <kernellwp@...il.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: LKML <linux-kernel@...r.kernel.org>, kvm <kvm@...r.kernel.org>,
Radim Krčmář <rkrcmar@...hat.com>,
Sean Christopherson <sean.j.christopherson@...el.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>
Subject: Re: [PATCH 1/2] KVM: X86: Single target IPI fastpath
On Tue, 12 Nov 2019 at 05:59, Paolo Bonzini <pbonzini@...hat.com> wrote:
>
> On 09/11/19 08:05, Wanpeng Li wrote:
> > From: Wanpeng Li <wanpengli@...cent.com>
> >
> > This patch tries to optimize x2apic physical destination mode, fixed delivery
> > mode single target IPI by delivering IPI to receiver immediately after sender
> > writes ICR vmexit to avoid various checks when possible.
> >
> > Testing on Xeon Skylake server:
> >
> > The virtual IPI latency from sender send to receiver receive reduces more than
> > 330+ cpu cycles.
> >
> > Running hackbench(reschedule ipi) in the guest, the avg handle time of MSR_WRITE
> > caused vmexit reduces more than 1000+ cpu cycles:
> >
> > Before patch:
> >
> > VM-EXIT     Samples  Samples%   Time%  Min Time  Max Time  Avg time
> > MSR_WRITE   5417390    90.01%  16.31%    0.69us  159.60us    1.08us
> >
> > After patch:
> >
> > VM-EXIT     Samples  Samples%   Time%  Min Time  Max Time  Avg time
> > MSR_WRITE   6726109    90.73%  62.18%    0.48us  191.27us    0.58us
>
> Do you have retpolines enabled? The bulk of the speedup might come just
> from the indirect jump.

With 'mitigations=off' added to the host kernel command line (via grub):

Before patch:

VM-EXIT     Samples  Samples%   Time%  Min Time  Max Time  Avg time
MSR_WRITE   2681713    92.98%  77.52%    0.38us   18.54us    0.73us ( +- 0.02% )

After patch:

VM-EXIT     Samples  Samples%   Time%  Min Time  Max Time  Avg time
MSR_WRITE   2953447    92.48%  62.47%    0.30us   59.09us    0.40us ( +- 0.02% )

Actually, this is not the first attempt to add a shortcut for
performance-sensitive MSR writes; the other effort is the tscdeadline
timer work from Isaku Yamahata:
https://patchwork.kernel.org/cover/10541035/ . In our product
observations, ICR and TSCDEADLINE MSR writes account for most of the
MSR write vmexits, and multicast IPIs are not as common as unicast IPIs
such as RESCHEDULE_VECTOR and CALL_FUNCTION_SINGLE_VECTOR. As far as I
know, something similar to this patch has already been deployed in
some cloud companies' private KVM forks.
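
To make "x2apic physical destination mode, fixed delivery mode, single
target" concrete, here is a rough standalone sketch of how such an ICR
write could be classified. It is not the code from the patch: the macro
and function names below are hypothetical, and only the bit layout
(vector in bits 7:0, delivery mode in bits 10:8, destination mode in
bit 11, shorthand in bits 19:18, destination ID in bits 63:32, with
0xffffffff as the broadcast ID) follows the x2APIC ICR definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* x2APIC ICR fields; in x2APIC mode the ICR is one 64-bit MSR write. */
#define ICR_VECTOR_MASK     0x000000ffu /* bits 7:0                      */
#define ICR_DELIVERY_MASK   0x00000700u /* bits 10:8, 000b = fixed       */
#define ICR_DEST_LOGICAL    0x00000800u /* bit 11, 0 = physical          */
#define ICR_SHORTHAND_MASK  0x000c0000u /* bits 19:18, 00b = no shorthand */
#define X2APIC_BROADCAST_ID 0xffffffffu /* physical-mode broadcast       */

/* Hypothetical helper: extract the interrupt vector from the ICR value. */
static uint8_t icr_vector(uint64_t icr)
{
	return (uint8_t)(icr & ICR_VECTOR_MASK);
}

/* Hypothetical helper: extract the 32-bit destination APIC ID. */
static uint32_t icr_dest(uint64_t icr)
{
	return (uint32_t)(icr >> 32);
}

/*
 * Hypothetical classifier: true only for a unicast, fixed-delivery,
 * physical-destination IPI, i.e. the case a fastpath would target.
 * Everything else would fall back to the normal (slow) emulation path.
 */
static bool icr_is_single_target_fixed(uint64_t icr)
{
	uint32_t lo = (uint32_t)icr;

	return (lo & ICR_DELIVERY_MASK) == 0 &&
	       (lo & ICR_DEST_LOGICAL) == 0 &&
	       (lo & ICR_SHORTHAND_MASK) == 0 &&
	       icr_dest(icr) != X2APIC_BROADCAST_ID;
}
```

For example, a write of (3ULL << 32) | 0xfd (vector 0xfd to APIC ID 3)
would pass this check, while the same value with bit 11 set (logical
destination) or with a non-zero shorthand would not.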
Wanpeng