[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20250402172458.1378112-1-andrew.cooper3@citrix.com>
Date: Wed, 2 Apr 2025 18:24:58 +0100
From: Andrew Cooper <andrew.cooper3@...rix.com>
To: LKML <linux-kernel@...r.kernel.org>
Cc: Andrew Cooper <andrew.cooper3@...rix.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>
Subject: [PATCH v2] x86/idle: Remove MFENCEs for X86_BUG_CLFLUSH_MONITOR
Commit 7e98b7192046 ("x86, idle: Use static_cpu_has() for CLFLUSH
workaround, add barriers") adds barriers, justified with:
... and add memory barriers around it since the documentation is explicit
that CLFLUSH is only ordered with respect to MFENCE.
This also triggered the same adjustment in commit
f8e617f45829 ("sched/idle/x86: Optimize unnecessary mwait_idle() resched
IPIs") during development, although it failed to get the static_cpu_has_bug()
treatment.
X86_BUG_CLFLUSH_MONITOR (a.k.a the AAI65 errata) is specific to Intel CPUs,
and the SDM currently states:
Executions of the CLFLUSH instruction are ordered with respect to each
other and with respect to writes, locked read-modify-write instructions,
and fence instructions[1].
With footnote 1 reading:
Earlier versions of this manual specified that executions of the CLFLUSH
instruction were ordered only by the MFENCE instruction. All processors
implementing the CLFLUSH instruction also order it relative to the other
operations enumerated above.
i.e. The SDM was incorrect at the time, and barriers should not have been
inserted. Double checking the original AAI65 errata (not available from
intel.com any more) shows no mention of barriers either.
Note: If this were a general codepath, the MFENCEs would be needed, because
AMD CPUs of the same vintage do sport otherwise-unordered CLFLUSHs.
Furthermore, use a plain alternative, rather than static_cpu_has_bug() and/or
no optimisation. The workaround is a single instruction.
Use an explicit %rax pointer rather than a general memory operand, because
MONITOR takes the pointer implicitly in the same way.
Link: https://web.archive.org/web/20090219054841/http://download.intel.com/design/xeon/specupdt/32033601.pdf
Fixes: 7e98b7192046 ("x86, idle: Use static_cpu_has() for CLFLUSH workaround, add barriers")
Fixes: f8e617f45829 ("sched/idle/x86: Optimize unnecessary mwait_idle() resched IPIs")
Signed-off-by: Andrew Cooper <andrew.cooper3@...rix.com>
---
CC: Thomas Gleixner <tglx@...utronix.de>
CC: Ingo Molnar <mingo@...hat.com>
CC: Borislav Petkov <bp@...en8.de>
CC: Dave Hansen <dave.hansen@...ux.intel.com>
CC: x86@...nel.org
CC: "H. Peter Anvin" <hpa@...or.com>
CC: linux-kernel@...r.kernel.org
v2:
* Fix the same pattern in mwait_idle() too
* Expand on why we're not using a general memory operand.
---
arch/x86/include/asm/mwait.h | 11 +++++------
arch/x86/kernel/process.c | 10 ++++------
2 files changed, 9 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index ce857ef54cf1..108d25bd4597 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -116,13 +116,12 @@ static __always_inline void __sti_mwait(unsigned long eax, unsigned long ecx)
static __always_inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
{
if (static_cpu_has_bug(X86_BUG_MONITOR) || !current_set_polling_and_test()) {
- if (static_cpu_has_bug(X86_BUG_CLFLUSH_MONITOR)) {
- mb();
- clflush((void *)¤t_thread_info()->flags);
- mb();
- }
- __monitor((void *)¤t_thread_info()->flags, 0, 0);
+ const void *addr = ¤t_thread_info()->flags;
+
+ alternative_input("", "clflush (%[addr])", X86_BUG_CLFLUSH_MONITOR,
+ [addr] "a" (addr));
+ __monitor(addr, 0, 0);
if (!need_resched()) {
if (ecx & 1) {
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 91f6ff618852..a1a41b3d94d2 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -907,13 +907,11 @@ static __init bool prefer_mwait_c1_over_halt(void)
static __cpuidle void mwait_idle(void)
{
if (!current_set_polling_and_test()) {
- if (this_cpu_has(X86_BUG_CLFLUSH_MONITOR)) {
- mb(); /* quirk */
- clflush((void *)¤t_thread_info()->flags);
- mb(); /* quirk */
- }
+ const void *addr = ¤t_thread_info()->flags;
- __monitor((void *)¤t_thread_info()->flags, 0, 0);
+ alternative_input("", "clflush (%[addr])", X86_BUG_CLFLUSH_MONITOR,
+ [addr] "a" (addr));
+ __monitor(addr, 0, 0);
if (!need_resched()) {
__sti_mwait(0, 0);
raw_local_irq_disable();
base-commit: acc4d5ff0b61eb1715c498b6536c38c1feb7f3c1
--
2.39.5
Powered by blists - more mailing lists