Message-Id: <1560994438-235698-1-git-send-email-fenghua.yu@intel.com>
Date: Wed, 19 Jun 2019 18:33:53 -0700
From: Fenghua Yu <fenghua.yu@...el.com>
To: "Thomas Gleixner" <tglx@...utronix.de>,
"Ingo Molnar" <mingo@...hat.com>, "Borislav Petkov" <bp@...en8.de>,
"H Peter Anvin" <hpa@...or.com>,
"Andy Lutomirski" <luto@...nel.org>,
"Peter Zijlstra" <peterz@...radead.org>,
"Ashok Raj" <ashok.raj@...el.com>,
"Tony Luck" <tony.luck@...el.com>,
"Ravi V Shankar" <ravi.v.shankar@...el.com>
Cc: "linux-kernel" <linux-kernel@...r.kernel.org>,
"x86" <x86@...nel.org>, Fenghua Yu <fenghua.yu@...el.com>
Subject: [PATCH v5 0/5] x86/umwait: Enable user wait instructions
Today, an application that needs to wait for a very short duration has
to spin. Spin loops consume more power and keep using execution
resources, which can hurt the sibling threads on a hyperthreaded core.
The new instructions umonitor, umwait and tpause provide a low-power
alternative for waiting that can also improve the performance of the HT
sibling and hand it any power headroom that is freed up. These
instructions can be used in both user space and kernel space.
A new MSR, IA32_UMWAIT_CONTROL, allows the kernel to set a time limit in
TSC-quanta that prevents user applications from waiting for a long time.
An application whose wait hits the limit can then yield the CPU and
should consider using other alternatives to wait.
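Because the limit is expressed in TSC cycles, the wall-clock time it
corresponds to depends on the TSC frequency. The conversion can be
sketched as below. This is only an illustration: the helper name
tsc_cycles_to_us and the frequencies used in the note after it are
assumptions, not part of the patch set.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch set): convert a number of TSC
 * cycles to whole microseconds for a TSC running at tsc_hz.
 * Returns 0 when tsc_hz is 0. Intended for small cycle counts; the
 * multiplication can overflow for very large values of cycles.
 */
static uint64_t tsc_cycles_to_us(uint64_t cycles, uint64_t tsc_hz)
{
    if (tsc_hz == 0)
        return 0;
    return cycles * 1000000ULL / tsc_hz;
}
```

With an assumed 1 GHz TSC, 100000 cycles come to 100 microseconds; with
an assumed 2 GHz TSC the same limit is 50 microseconds.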
A quote from Andy Lutomirski on setting the time limit:
"What I want to avoid is the case where it works dramatically
differently on NO_HZ_FULL systems as compared to everything else.
Also, UMWAIT may behave a bit differently if the max timeout is hit,
and I'd like that path to get exercised widely by making it happen
even on default configs.
So I propose setting the timeout to either 100 microseconds or 100k
"cycles" by default."
The processor supports two levels of optimized states: a light-weight
power/performance optimized state (C0.1 state) and an improved
power/performance optimized state (C0.2 state with deeper power saving
and higher exit latency). The above MSR can also be used to disallow
entry to C0.2, in which case any request for C0.2 falls back to C0.1.
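As a rough illustration of how the two settings share one MSR value, the
sketch below packs and unpacks a 32-bit control word. It assumes the
field layout described in the SDM (bit 0 disallows C0.2, bit 1 is
reserved, bits 31:2 hold the maximum time in TSC-quanta). The macro and
function names are made up for this example and are not necessarily the
ones used in the patches.

```c
#include <stdint.h>

/* Assumed layout of IA32_UMWAIT_CONTROL; names are illustrative only. */
#define UMWAIT_C02_DISABLE  (1u << 0)   /* 1 = C0.2 is not allowed */
#define UMWAIT_RESERVED     (1u << 1)   /* must stay 0 */
#define UMWAIT_TIME_MASK    (~0x3u)     /* bits 31:2: maximum wait time */

/*
 * Build a control value. Returns 0 on success, or -1 if max_time has
 * either of its two low bits set, since those bits are not part of the
 * time field.
 */
static int umwait_ctrl_pack(uint32_t max_time, int c02_enabled, uint32_t *out)
{
    if (max_time & ~UMWAIT_TIME_MASK)
        return -1;
    *out = max_time | (c02_enabled ? 0u : UMWAIT_C02_DISABLE);
    return 0;
}

/* Decode helpers for a previously packed value. */
static int umwait_ctrl_c02_enabled(uint32_t ctrl)
{
    return !(ctrl & UMWAIT_C02_DISABLE);
}

static uint32_t umwait_ctrl_max_time(uint32_t ctrl)
{
    return ctrl & UMWAIT_TIME_MASK;
}
```

Packing a maximum time of 100000 with C0.2 disallowed gives 100001 in
this sketch: the time field is unchanged and only bit 0 is set.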
This patch set covers feature discovery, provides initial values for
the MSR, and adds sysfs control files that let an administrator tweak
the values in the MSR if needed.
The sysfs interface files are in /sys/devices/system/cpu/umwait_control/
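A user-space tool could read one of these control files roughly as
follows. This is a minimal sketch: the helper read_sysfs_uint is
hypothetical, and the attribute names mentioned in the note after it
(enable_c02 and max_time) are assumed here, since this cover letter
does not name the individual files.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read a single unsigned decimal value from a
 * sysfs attribute. Returns 0 on success, or -1 if the file cannot be
 * opened (for example on a CPU or kernel without umwait support) or
 * does not contain a number.
 */
static int read_sysfs_uint(const char *path, unsigned long *val)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%lu", val) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

For instance, calling it with
"/sys/devices/system/cpu/umwait_control/max_time" would be expected to
return the current time limit, assuming that attribute name. Treating a
missing file as "feature not available" keeps the tool usable on older
systems.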
GCC 9 enables intrinsics for the instructions. To use the instructions,
user applications should include <immintrin.h> and be compiled with
-mwaitpkg.
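A user-space wait on a flag might look like the sketch below. It is
illustrative only and makes several assumptions: x86-64 with GCC 9 or
later (or a recent clang), a target attribute in place of passing
-mwaitpkg for the whole file, and a fallback to a pause-based spin loop
when CPUID does not enumerate WAITPKG. The function names are
hypothetical; only _umonitor, _umwait, _mm_pause, __rdtsc and
__get_cpuid_count come from the compiler headers.

```c
#include <cpuid.h>
#include <stdint.h>
#include <x86intrin.h>  /* includes <immintrin.h> and provides __rdtsc() */

/* CPUID.(EAX=7,ECX=0):ECX bit 5 enumerates WAITPKG. */
static int cpu_has_waitpkg(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return (ecx >> 5) & 1;
}

/* The target attribute enables the intrinsics for this function only. */
__attribute__((target("waitpkg")))
static int wait_umwait(volatile uint32_t *flag, uint64_t deadline)
{
    while (*flag == 0) {
        if (__rdtsc() >= deadline)
            return 0;               /* gave up: deadline passed */
        _umonitor((void *)flag);    /* arm monitoring of flag's address */
        if (*flag != 0)             /* re-check to avoid a missed wakeup */
            break;
        _umwait(1, deadline);       /* control bit 0 set: request C0.1 */
    }
    return 1;
}

/* Fallback for CPUs without the new instructions. */
static int wait_spin(volatile uint32_t *flag, uint64_t deadline)
{
    while (*flag == 0) {
        if (__rdtsc() >= deadline)
            return 0;
        _mm_pause();
    }
    return 1;
}

/* Returns 1 if *flag became non-zero, 0 if max_cycles TSC cycles elapsed. */
int wait_for_flag(volatile uint32_t *flag, uint64_t max_cycles)
{
    uint64_t deadline = __rdtsc() + max_cycles;

    return cpu_has_waitpkg() ? wait_umwait(flag, deadline)
                             : wait_spin(flag, deadline);
}
```

The loop re-checks the flag and the deadline after every _umwait return,
so it behaves the same whether the wait ended because of a store to the
monitored address, the requested deadline, or the OS-imposed limit in
IA32_UMWAIT_CONTROL. The return value of _umwait, which reports the
last case, is ignored here for simplicity.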
Detailed information on the instructions, the MSR, and syntax of the
intrinsics can be found in the latest Intel Architecture Instruction
Set Extensions and Future Features Programming Reference and Intel 64
and IA-32 Architectures Software Developer's Manual.
Changelog:
v5:
- Change locking from mutex to disabling irq before wrmsr per
Andy Lutomirski's comment
- Add macro UMWAIT_CTRL_VAL to explicitly disable C0.2 per
Thomas Gleixner's comment
- Move umwait.c to arch/x86/kernel/cpu/ per Peter Zijlstra's comment
- Add justification of max time 100k per Peter Zijlstra's comment
v4:
- Error out when bit[1:0] in IA32_UMWAIT_CONTROL is not zero per
Andy Lutomirski's comment.
- Use umwait_control_cached to cache the IA32_UMWAIT_CONTROL MSR. This
variable replaces the two previous variables umwait_max_time and
umwait_c0_2_enabled. The code is simpler than before and the cached MSR
value will be easier to use in future KVM support.
v3:
Address issues pointed out by Andy Lutomirski:
- Change default umwait max time to 100k TSC cycles
- Set up the MSR on the BSP when resuming from suspend/hibernation
- A few other naming and coding changes as suggested
- Some security concerns of the user wait instructions are not issues
of the patches and cannot be addressed in the patch set. They will be
discussed on lkml.
Plus:
- Add ABI document entry for umwait control sysfs interfaces
v2:
- Address comments from Thomas Gleixner and Andy Lutomirski
- Remove vDSO functions
- Add sysfs control file for umwait max time
v1:
Based on comments from Thomas:
- Change user APIs to vDSO functions
- Change sysfs per comments from Thomas.
- Change patch descriptions etc
Fenghua Yu (5):
  x86/cpufeatures: Enumerate user wait instructions
  x86/umwait: Initialize umwait control values
  x86/umwait: Add sysfs interface to control umwait C0.2 state
  x86/umwait: Add sysfs interface to control umwait maximum time
  x86/umwait: Document umwait control sysfs interfaces

 .../ABI/testing/sysfs-devices-system-cpu |  21 ++
 arch/x86/include/asm/cpufeatures.h       |   1 +
 arch/x86/include/asm/msr-index.h         |   4 +
 arch/x86/kernel/cpu/Makefile             |   1 +
 arch/x86/kernel/cpu/umwait.c             | 205 ++++++++++++++++++
 5 files changed, 232 insertions(+)
 create mode 100644 arch/x86/kernel/cpu/umwait.c
--
2.19.1