Message-ID: <142c0f46-7408-e5ce-713c-34aa80f80960@amd.com>
Date:   Thu, 19 May 2022 15:43:42 +0530
From:   Wyes Karny <wyes.karny@....com>
To:     linux-kernel@...r.kernel.org
Cc:     Lewis.Carroll@....com, Mario.Limonciello@....com,
        gautham.shenoy@....com, Ananth.Narayan@....com, bharata@....com,
        len.brown@...el.com, x86@...nel.org, tglx@...utronix.de,
        mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
        hpa@...or.com, peterz@...radead.org, chang.seok.bae@...el.com,
        keescook@...omium.org, metze@...ba.org, zhengqi.arch@...edance.com,
        mark.rutland@....com, puwen@...on.cn, rafael.j.wysocki@...el.com,
        andrew.cooper3@...rix.com, jing2.liu@...el.com,
        jmattson@...gle.com, pawan.kumar.gupta@...ux.intel.com
Subject: Re: [PATCH v3 0/3] x86: Prefer MWAIT over HLT on AMD processors

Hello Dave,

Is there any feedback on this patchset?


On 5/10/2022 3:48 PM, Wyes Karny wrote:
> This is version 3 of the patchset "Prefer MWAIT over HLT on AMD
> processors".
> 
> The previous versions are
> v2: https://lore.kernel.org/lkml/20220505104856.452311-1-wyes.karny@amd.com/
> v1: https://lore.kernel.org/lkml/20220405130021.557880-1-wyes.karny@amd.com/
> 
> The changes from v2 to v3 are:
> - Update some text in commit messages
> - Update the documentation around idle=nomwait
> - Remove unnecessary CPUID level check from prefer_mwait_c1_over_halt function
> 
> Background
> ==========
> 
> Currently, in the absence of the cpuidle driver (e.g. when global C-States are
> disabled in the BIOS or when the cpuidle driver is not compiled in), the default
> idle state on AMD Zen processors uses the HLT instruction even though there is
> support for the MWAIT instruction, which is more efficient than HLT.
> 
> HPC customers who want to optimize for lower latency are known to disable
> Global C-States in the BIOS. Some vendors allow choosing a BIOS 'performance'
> profile which explicitly disables C-States. In this scenario, the cpuidle
> driver will not be loaded and the kernel will continue with the default idle
> state chosen at boot time. On AMD systems currently the default idle state is
> HLT which has a higher exit latency compared to MWAIT.
> 
> The reason for the choice of HLT over MWAIT on AMD systems is:
> 
> 1. Families prior to 10h didn't support MWAIT
> 2. Families 10h-15h supported MWAIT, but not MWAIT C1. Hence it was
>    preferable to use HLT as the default state on these systems.
> 
> However, AMD Family 17h onwards supports MWAIT as well as MWAIT C1. And it is
> preferable to use MWAIT as the default idle state on these systems, as it has
> lower exit latencies.
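
The selection logic described above can be modelled roughly as follows. This is a simplified Python sketch, not the kernel's C code: the function name `default_idle_instruction` and its three boolean parameters are hypothetical stand-ins for the idle=nomwait command-line override and the CPU's MWAIT capabilities, and it only applies when no cpuidle driver is in control.

```python
def default_idle_instruction(nomwait: bool, has_mwait: bool, has_mwait_c1: bool) -> str:
    """Pick the default idle instruction when no cpuidle driver is loaded.

    All names here are illustrative assumptions, not kernel identifiers.
    """
    # idle=nomwait on the kernel command line forces HLT.
    if nomwait:
        return "HLT"
    # Families prior to 10h: no MWAIT support at all.
    if not has_mwait:
        return "HLT"
    # Families 10h-15h: MWAIT is present, but MWAIT C1 is not.
    if not has_mwait_c1:
        return "HLT"
    # Family 17h onwards: MWAIT C1 is available and has lower exit latency.
    return "MWAIT"
```

For example, `default_idle_instruction(False, True, True)` models a Zen system without the override, while `default_idle_instruction(False, True, False)` models a Family 10h-15h system.
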
> 
> The tables below show the exit latency for HLT and MWAIT on an AMD Zen 3
> system. Exit latency is measured by issuing a wakeup (IPI) to another CPU and
> measuring how many clock cycles it took to wake up. Each iteration measures 10K
> wakeups with the source and destination CPUs pinned.
> 
> HLT:
> 
> 25.0000th percentile  :      1900 ns
> 50.0000th percentile  :      2000 ns
> 75.0000th percentile  :      2300 ns
> 90.0000th percentile  :      2500 ns
> 95.0000th percentile  :      2600 ns
> 99.0000th percentile  :      2800 ns
> 99.5000th percentile  :      3000 ns
> 99.9000th percentile  :      3400 ns
> 99.9500th percentile  :      3600 ns
> 99.9900th percentile  :      5900 ns
>   Min latency         :      1700 ns
>   Max latency         :      5900 ns
> Total Samples      9999
> 
> MWAIT:
> 
> 25.0000th percentile  :      1400 ns
> 50.0000th percentile  :      1500 ns
> 75.0000th percentile  :      1700 ns
> 90.0000th percentile  :      1800 ns
> 95.0000th percentile  :      1900 ns
> 99.0000th percentile  :      2300 ns
> 99.5000th percentile  :      2500 ns
> 99.9000th percentile  :      3200 ns
> 99.9500th percentile  :      3500 ns
> 99.9900th percentile  :      4600 ns
>   Min latency         :      1200 ns
>   Max latency         :      4600 ns
> Total Samples      9997
> 
> Improvement (99th percentile): 21.74%
> 
> Below is another result, for the context_switch2 micro-benchmark, which shows
> the impact of improved wakeup latency as an increase in context switches per
> second.
> 
> Link: https://ozlabs.org/~anton/junkcode/context_switch2.c
> 
> with HLT:
> -------------------------------
> 50.0000th percentile  :  190184
> 75.0000th percentile  :  191032
> 90.0000th percentile  :  192314
> 95.0000th percentile  :  192520
> 99.0000th percentile  :  192844
> MIN  :  190148
> MAX  :  192852
> 
> with MWAIT:
> -------------------------------
> 50.0000th percentile  :  277444
> 75.0000th percentile  :  278268
> 90.0000th percentile  :  278888
> 95.0000th percentile  :  279164
> 99.0000th percentile  :  280504
> MIN  :  273278
> MAX  :  281410
> 
> Improvement(99th percentile): ~ 45.46%
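
Both improvement figures can be reproduced from the 99th-percentile rows. The sketch below is only an illustration: the helper name `pct_gain` is hypothetical, and the choice of denominator is inferred from the numbers rather than stated in the text (each percentage matches when the difference is divided by the smaller of the two values).

```python
def pct_gain(larger: float, smaller: float) -> float:
    """Difference between two values as a percentage of the smaller one."""
    return (larger - smaller) / smaller * 100.0

# Exit latency at the 99th percentile: HLT 2800 ns vs MWAIT 2300 ns.
latency_gain = pct_gain(2800, 2300)

# context_switch2 at the 99th percentile: MWAIT 280504 vs HLT 192844.
throughput_gain = pct_gain(280504, 192844)

print(f"latency improvement:    {latency_gain:.2f}%")
print(f"throughput improvement: {throughput_gain:.2f}%")
```

Rounded to two decimals these give 21.74% and 45.46%, matching the figures quoted above.
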
> 
> A similar trend is observed on older Zen processors also.
> 
> Here we enable MWAIT instruction as the default idle call for AMD Zen
> processors which support MWAIT. We retain the existing behaviour for older
> processors which depend on HLT.
> 
> This patchset restores the decision tree that was previously present in the
> kernel, introduced by Thomas Gleixner's patch: commit 09fd4b4ef5bc ("x86: use
> cpuid to check MWAIT support for C1")
> 
> NOTE: This change only impacts the default idle behaviour in the absence of
> cpuidle driver. If the cpuidle driver is present, it controls the processor
> idle behaviour.
> 
> Fixes: commit b253149b843f ("sched/idle/x86: Restore mwait_idle() to fix boot hangs, to improve power savings and to improve performance")
> 
> Changelog:
> v3:
> - Update documentation around idle=nomwait
> - Remove unnecessary CPUID check from prefer_mwait_c1_over_halt function
> v2:
> - Remove vendor checks, fix idle=nomwait condition, fix documentation
> 
> Wyes Karny (3):
>   x86: Use HLT in default_idle when idle=nomwait cmdline arg is passed
>   x86: Remove vendor checks from prefer_mwait_c1_over_halt
>   x86: Fix comment for X86_FEATURE_ZEN
> 
>  Documentation/admin-guide/pm/cpuidle.rst | 15 ++++++----
>  arch/x86/include/asm/cpufeatures.h       |  2 +-
>  arch/x86/include/asm/mwait.h             |  1 +-
>  arch/x86/kernel/process.c                | 39 ++++++++++++++++++-------
>  4 files changed, 40 insertions(+), 17 deletions(-)
> 
> base-commit: d70522fc541224b8351ac26f4765f2c6268f8d72

Thanks,
Wyes