[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20161016060320-mutt-send-email-mst@kernel.org>
Date: Sun, 16 Oct 2016 06:21:41 +0300
From: "Michael S. Tsirkin" <mst@...hat.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
rkrcmar@...hat.com, yang.zhang.wz@...il.com, feng.wu@...el.com
Subject: Re: [PATCH 1/5] KVM: x86: avoid atomic operations on APICv vmentry
On Fri, Oct 14, 2016 at 08:21:27PM +0200, Paolo Bonzini wrote:
> On some benchmarks (e.g. netperf with ioeventfd disabled), APICv
> posted interrupts turn out to be slower than interrupt injection via
> KVM_REQ_EVENT.
>
> This patch optimizes a bit the IRR update, avoiding expensive atomic
> operations in the common case where PI.ON=0 at vmentry or the PIR vector
> is mostly zero. This saves at least 20 cycles (1%) per vmexit, as
> measured by kvm-unit-tests' inl_from_qemu test (20 runs):
>
> | enable_apicv=1 | enable_apicv=0
> | mean stdev | mean stdev
> ----------|-----------------|------------------
> before | 5826 32.65 | 5765 47.09
> after | 5809 43.42 | 5777 77.02
>
> Of course, any change in the right column is just placebo effect. :)
> The savings are bigger if interrupts are frequent.
>
> Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
> ---
> arch/x86/kvm/lapic.c | 6 ++++--
> arch/x86/kvm/vmx.c | 9 ++++++++-
> 2 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 23b99f305382..63a442aefc12 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -342,9 +342,11 @@ void __kvm_apic_update_irr(u32 *pir, void *regs)
> u32 i, pir_val;
>
> for (i = 0; i <= 7; i++) {
> - pir_val = xchg(&pir[i], 0);
> - if (pir_val)
> + pir_val = READ_ONCE(pir[i]);
> + if (pir_val) {
> + pir_val = xchg(&pir[i], 0);
> *((u32 *)(regs + APIC_IRR + i * 0x10)) |= pir_val;
> + }
> }
> }
> EXPORT_SYMBOL_GPL(__kvm_apic_update_irr);
gcc doesn't seem to unroll this loop and it's
probably worth unrolling it
The following seems to do the trick for me on upstream - I didn't
benchmark it though. Is there a kvm unit test for interrupts?
--->
kvm: unroll the loop in __kvm_apic_update_irr.
This is hot data path in interrupt-rich workloads, worth unrolling.
Signed-off-by: Michael S. Tsirkin <mst@...hat.com>
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index b62c852..0c3462c 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -337,7 +337,8 @@ static u8 count_vectors(void *bitmap)
return count;
}
-void __kvm_apic_update_irr(u32 *pir, void *regs)
+void __attribute__((optimize("unroll-loops")))
+__kvm_apic_update_irr(u32 *pir, void *regs)
{
u32 i, pir_val;
Powered by blists - more mailing lists