Message-ID:
 <TY3PR01MB111481E9B0AF263ACC8EA5D4AE5BA2@TY3PR01MB11148.jpnprd01.prod.outlook.com>
Date: Fri, 9 Aug 2024 06:02:50 +0000
From: "Tomohiro Misono (Fujitsu)" <misono.tomohiro@...itsu.com>
To: 'Ankur Arora' <ankur.a.arora@...cle.com>, "linux-pm@...r.kernel.org"
	<linux-pm@...r.kernel.org>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
	"linux-arm-kernel@...ts.infradead.org"
	<linux-arm-kernel@...ts.infradead.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>
CC: "catalin.marinas@....com" <catalin.marinas@....com>, "will@...nel.org"
	<will@...nel.org>, "tglx@...utronix.de" <tglx@...utronix.de>,
	"mingo@...hat.com" <mingo@...hat.com>, "bp@...en8.de" <bp@...en8.de>,
	"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>, "x86@...nel.org"
	<x86@...nel.org>, "hpa@...or.com" <hpa@...or.com>, "pbonzini@...hat.com"
	<pbonzini@...hat.com>, "wanpengli@...cent.com" <wanpengli@...cent.com>,
	"vkuznets@...hat.com" <vkuznets@...hat.com>, "rafael@...nel.org"
	<rafael@...nel.org>, "daniel.lezcano@...aro.org" <daniel.lezcano@...aro.org>,
	"peterz@...radead.org" <peterz@...radead.org>, "arnd@...db.de"
	<arnd@...db.de>, "lenb@...nel.org" <lenb@...nel.org>, "mark.rutland@....com"
	<mark.rutland@....com>, "harisokn@...zon.com" <harisokn@...zon.com>,
	"mtosatti@...hat.com" <mtosatti@...hat.com>, "sudeep.holla@....com"
	<sudeep.holla@....com>, "cl@...two.org" <cl@...two.org>,
	"joao.m.martins@...cle.com" <joao.m.martins@...cle.com>,
	"boris.ostrovsky@...cle.com" <boris.ostrovsky@...cle.com>,
	"konrad.wilk@...cle.com" <konrad.wilk@...cle.com>
Subject: RE: [PATCH v6 00/10] Enable haltpoll on arm64

> Subject: [PATCH v6 00/10] Enable haltpoll on arm64
> 
> This patchset enables the cpuidle-haltpoll driver and its namesake
> governor on arm64. This is specifically interesting for KVM guests by
> reducing IPC latencies.
> 
> Comparing idle switching latencies on an arm64 KVM guest with
> perf bench sched pipe:
> 
>                                      usecs/op       %stdev
> 
>   no haltpoll (baseline)               13.48       +-  5.19%
>   with haltpoll                         6.84       +- 22.07%

I got similar results with a VM on a Grace machine (patches applied to 6.10).

[default]
# cat /sys/devices/system/cpu/cpuidle/current_driver
none
# perf bench sched pipe
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes

     Total time: 23.832 [sec]

      23.832644 usecs/op
          41959 ops/sec

[With "cpuidle-haltpoll.force=1" commandline]
# cat /sys/devices/system/cpu/cpuidle/current_driver
haltpoll
# perf bench sched pipe
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes

     Total time: 6.340 [sec]

       6.340116 usecs/op
         157725 ops/sec
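
For reference, a quick check of the arithmetic on the two runs above. This is only a sketch in Python that uses the usecs/op figures printed by perf; the variable and function names are my own and not part of any tool.

```python
# usecs/op as printed by the two 'perf bench sched pipe' runs above
baseline_usecs = 23.832644   # current_driver: none
haltpoll_usecs = 6.340116    # current_driver: haltpoll


def ops_per_sec(usecs_per_op):
    """Convert a per-operation latency in microseconds to operations/second."""
    return 1_000_000 / usecs_per_op


# How many times faster one pipe operation completes with haltpoll.
speedup = baseline_usecs / haltpoll_usecs

# Relative drop in per-operation latency, in percent.
reduction_pct = (1 - haltpoll_usecs / baseline_usecs) * 100

print(f"speedup: {speedup:.2f}x")
print(f"latency reduction: {reduction_pct:.1f}%")
print(f"ops/sec: {int(ops_per_sec(baseline_usecs))} -> "
      f"{int(ops_per_sec(haltpoll_usecs))}")
```

On these two single runs that works out to roughly a 3.76x speedup (about 73% lower usecs/op), and the truncated ops/sec values match what perf reported.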

Tested-by: Misono Tomohiro <misono.tomohiro@...itsu.com>
Regards,
Tomohiro


> 
> 
> No change in performance for a similar test on x86:
> 
>                                      usecs/op        %stdev
> 
>   haltpoll w/ cpu_relax() (baseline)     4.75      +-  1.76%
>   haltpoll w/ smp_cond_load_relaxed()    4.78      +-  2.31%
> 
> Both sets of tests were on otherwise idle systems with guest VCPUs
> pinned to specific PCPUs. One reason for the higher stdev on arm64
> is that trapping of the WFE instruction by the host KVM is contingent
> on the number of tasks on the runqueue.
> 
> 
> The patch series is organized in three parts:
> 
>  - patch 1, reorganizes the poll_idle() loop, switching to
>    smp_cond_load_relaxed() in the polling loop.
>    Relatedly patches 2, 3 mangle the config option ARCH_HAS_CPU_RELAX,
>    renaming it to ARCH_HAS_OPTIMIZED_POLL.
> 
>  - patches 4-6 reorganize the haltpoll selection and init logic
>    to allow architecture code to select it.
> 
>  - and finally, patches 7-10 add the bits for arm64 support.
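
As a rough illustration of the loop shape described in the first item above (this is a userspace model, not the kernel code from patch 1): the wake condition is checked on every iteration, while the more expensive clock read is amortized over a batch of iterations. `RELAX_COUNT`, `poll_until`, and the callable arguments are hypothetical names for this sketch. In the kernel the condition wait is done by smp_cond_load_relaxed(), which on arm64 can wait in WFE instead of spinning.

```python
import time

RELAX_COUNT = 200  # hypothetical batch size between clock reads


def poll_until(need_resched, limit_ns, relax_count=RELAX_COUNT,
               clock_ns=time.monotonic_ns):
    """Poll need_resched() until it returns True or limit_ns elapses.

    Returns True if the condition was observed, False on timeout.
    """
    start = clock_ns()
    while True:
        # Cheap part: check the wake condition up to relax_count times.
        for _ in range(relax_count):
            if need_resched():
                return True
        # Expensive part: read the clock only once per batch.
        if clock_ns() - start > limit_ns:
            return False
```

Because the deadline is only examined between batches, the loop can overshoot limit_ns by up to one batch. In this model that is bounded by relax_count iterations; the WFET item below concerns the analogous bound when the wait is a WFE.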
> 
> 
> What is still missing: this series largely completes the haltpoll side
> of functionality for arm64. There are, however, a few related areas
> that still need to be threshed out:
> 
>  - WFET support: WFE on arm64 does not guarantee that poll_idle()
>    would terminate in halt_poll_ns. Using WFET would address this.
>  - KVM_NO_POLL support on arm64
>  - KVM TWED support on arm64: allow the host to limit time spent in
>    WFE.
> 
> 
> Changelog:
> 
> v6:
> 
>  - reordered the patches to keep poll_idle() and ARCH_HAS_OPTIMIZED_POLL
>    changes together (comment from Christoph Lameter)
>  - threshes out the commit messages a bit more (comments from Christoph
>    Lameter, Sudeep Holla)
>  - also rework selection of cpuidle-haltpoll. Now selected based
>    on the architectural selection of ARCH_CPUIDLE_HALTPOLL.
>  - moved back to arch_haltpoll_want() (comment from Joao Martins)
>    Also, arch_haltpoll_want() now takes the force parameter and is
>    now responsible for the complete selection (or not) of haltpoll.
>  - fixes the build breakage on i386
>  - fixes the cpuidle-haltpoll module breakage on arm64 (comment from
>    Tomohiro Misono, Haris Okanovic)
> 
> 
> v5:
>  - rework the poll_idle() loop around smp_cond_load_relaxed() (review
>    comment from Tomohiro Misono.)
>  - also rework selection of cpuidle-haltpoll. Now selected based
>    on the architectural selection of ARCH_CPUIDLE_HALTPOLL.
>  - arch_haltpoll_supported() (renamed from arch_haltpoll_want()) on
>    arm64 now depends on the event-stream being enabled.
>  - limit POLL_IDLE_RELAX_COUNT on arm64 (review comment from Haris Okanovic)
>  - ARCH_HAS_CPU_RELAX is now renamed to ARCH_HAS_OPTIMIZED_POLL.
> 
> v4 changes from v3:
>  - change 7/8 per Rafael input: drop the parens and use ret for the final check
>  - add 8/8 which renames the guard for building poll_state
> 
> v3 changes from v2:
>  - fix 1/7 per Petr Mladek - remove ARCH_HAS_CPU_RELAX from arch/x86/Kconfig
>  - add Ack-by from Rafael Wysocki on 2/7
> 
> v2 changes from v1:
>  - added patch 7 where we change cpu_relax with smp_cond_load_relaxed per PeterZ
>    (this improves by 50% at least the CPU cycles consumed in the tests above:
>    10,716,881,137 now vs 14,503,014,257 before)
>  - removed the ifdef from patch 1 per RafaelW
> 
> Please review.
> 
> Ankur Arora (5):
>   cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL
>   cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL
>   arm64: idle: export arch_cpu_idle
>   arm64: support cpuidle-haltpoll
>   cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64
> 
> Joao Martins (4):
>   Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig
>   cpuidle-haltpoll: define arch_haltpoll_want()
>   governors/haltpoll: drop kvm_para_available() check
>   arm64: define TIF_POLLING_NRFLAG
> 
> Mihai Carabas (1):
>   cpuidle/poll_state: poll via smp_cond_load_relaxed()
> 
>  arch/Kconfig                              |  3 +++
>  arch/arm64/Kconfig                        | 10 ++++++++++
>  arch/arm64/include/asm/cpuidle_haltpoll.h |  9 +++++++++
>  arch/arm64/include/asm/thread_info.h      |  2 ++
>  arch/arm64/kernel/cpuidle.c               | 23 +++++++++++++++++++++++
>  arch/arm64/kernel/idle.c                  |  1 +
>  arch/x86/Kconfig                          |  5 ++---
>  arch/x86/include/asm/cpuidle_haltpoll.h   |  1 +
>  arch/x86/kernel/kvm.c                     | 13 +++++++++++++
>  drivers/acpi/processor_idle.c             |  4 ++--
>  drivers/cpuidle/Kconfig                   |  5 ++---
>  drivers/cpuidle/Makefile                  |  2 +-
>  drivers/cpuidle/cpuidle-haltpoll.c        | 12 +-----------
>  drivers/cpuidle/governors/haltpoll.c      |  6 +-----
>  drivers/cpuidle/poll_state.c              | 21 ++++++++++++++++-----
>  drivers/idle/Kconfig                      |  1 +
>  include/linux/cpuidle.h                   |  2 +-
>  include/linux/cpuidle_haltpoll.h          |  5 +++++
>  18 files changed, 94 insertions(+), 31 deletions(-)
>  create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
> 
> --
> 2.43.5

