Message-ID: <DM5PR21MB013707183D9E271E60FBD435D7650@DM5PR21MB0137.namprd21.prod.outlook.com>
Date: Fri, 25 Oct 2019 17:16:10 +0000
From: Michael Kelley <mikelley@...rosoft.com>
To: vkuznets <vkuznets@...hat.com>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>,
KY Srinivasan <kys@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Sasha Levin <sashal@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>,
Roman Kagan <rkagan@...tuozzo.com>,
Joe Perches <joe@...ches.com>
Subject: RE: [PATCH v2] x86/hyper-v: micro-optimize send_ipi_one case
From: Vitaly Kuznetsov <vkuznets@...hat.com>
>
> When sending an IPI to a single CPU there is no need to deal with cpumasks.
> With a 2 CPU guest on WS2019 I'm seeing a minor (about 3%, 8043 -> 7761 CPU
> cycles) improvement with the smp_call_function_single() loop benchmark. The
> optimization, however, is tiny and straightforward. Also, send_ipi_one() is
> important for the PV spinlock kick.
>
> I was also wondering if it would make sense to switch to using regular
> APIC IPI send for the CPU > 64 case but no, it is twice as expensive (12650 CPU
> cycles for an __send_ipi_mask_ex() call, 26000 for orig_apic.send_IPI(cpu,
> vector)).
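
As a rough illustration of the kind of loop benchmark referred to above (not
the harness actually used; everything except the kernel APIs
smp_call_function_single(), rdtsc_ordered() and div64_u64() is made up for
the example), one could average the TSC cost of an IPI round trip like this:

	#include <linux/smp.h>
	#include <linux/math64.h>
	#include <asm/msr.h>

	static void nop_func(void *info) { }

	/* Average TSC cycles for one smp_call_function_single() round trip. */
	static u64 bench_ipi_cycles(int target_cpu, int iterations)
	{
		u64 start, end;
		int i;

		start = rdtsc_ordered();
		for (i = 0; i < iterations; i++)
			smp_call_function_single(target_cpu, nop_func, NULL, 1);
		end = rdtsc_ordered();

		return div64_u64(end - start, iterations);
	}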
>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@...hat.com>
> ---
> Changes since v1:
> - Style changes [Roman, Joe]
> ---
> arch/x86/hyperv/hv_apic.c | 13 ++++++++++---
> arch/x86/include/asm/trace/hyperv.h | 15 +++++++++++++++
> 2 files changed, 25 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
> index e01078e93dd3..fd17c6341737 100644
> --- a/arch/x86/hyperv/hv_apic.c
> +++ b/arch/x86/hyperv/hv_apic.c
> @@ -194,10 +194,17 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector)
>
> static bool __send_ipi_one(int cpu, int vector)
> {
> - struct cpumask mask = CPU_MASK_NONE;
> + trace_hyperv_send_ipi_one(cpu, vector);
>
> - cpumask_set_cpu(cpu, &mask);
> - return __send_ipi_mask(&mask, vector);
> + if (!hv_hypercall_pg || (vector < HV_IPI_LOW_VECTOR) ||
> + (vector > HV_IPI_HIGH_VECTOR))
> + return false;
> +
> + if (cpu >= 64)
> + return __send_ipi_mask_ex(cpumask_of(cpu), vector);
The above test should be checking the VP number, not the CPU number, since
it is the VP number that forms the bitmap argument to the hypercall. In all
current Hyper-V implementations the CPU number and the VP number are the
same, as far as I am aware, but that's not guaranteed to hold in the future.
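
Something along these lines would key the choice off the VP number instead
(an illustrative sketch only, not a tested patch; it reuses the helpers
already present in the hunk above):

	int vp = hv_cpu_number_to_vp_number(cpu);

	/* A single VP >= 64 cannot be addressed via the 64-bit fast-call mask. */
	if (vp >= 64)
		return __send_ipi_mask_ex(cpumask_of(cpu), vector);

	return !hv_do_fast_hypercall16(HVCALL_SEND_IPI, vector, BIT_ULL(vp));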
Michael
> +
> + return !hv_do_fast_hypercall16(HVCALL_SEND_IPI, vector,
> + BIT_ULL(hv_cpu_number_to_vp_number(cpu)));
> }
>