[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20220324173927.2230447-2-bigeasy@linutronix.de>
Date: Thu, 24 Mar 2022 18:39:25 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: torvalds@...ux-foundation.org
Cc: akpm@...ux-foundation.org, bigeasy@...utronix.de,
boqun.feng@...il.com, bp@...en8.de, linux-kernel@...r.kernel.org,
longman@...hat.com, mingo@...nel.org, peterz@...radead.org,
tglx@...utronix.de, will@...nel.org,
Dennis Zhou <dennis@...nel.org>, Tejun Heo <tj@...nel.org>,
Christoph Lameter <cl@...ux.com>
Subject: [PATCH 1/3] x86/percpu: Remove volatile from arch_raw_cpu_ptr().
The volatile attribute in the inline assembly of arch_raw_cpu_ptr()
forces the compiler to always generate the code, even if the compiler
can decide upfront that its result is not needed.
For instance invoking __intel_pmu_disable_all(false) (like
intel_pmu_snapshot_arch_branch_stack() does) leads to loading the
address of &cpu_hw_events into the register while compiler knows that it
has no need for it. This ends up with code like:
| movq $cpu_hw_events, %rax #, tcp_ptr__
| add %gs:this_cpu_off(%rip), %rax # this_cpu_off, tcp_ptr__
| xorl %eax, %eax # tmp93
It also creates additional code within local_lock() with !RT &&
!LOCKDEP which is not desired.
By removing the volatile attribute the compiler can place the
function freely and avoid it if it is not needed in the end.
By using the function twice the compiler properly caches only the
variable offset and always loads the CPU-offset.
Remove volatile from arch_raw_cpu_ptr().
Suggested-by: Linus Torvalds <torvalds@...ux-foundation.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
---
arch/x86/include/asm/percpu.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index a3c33b79fb865..5d572a19a389c 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -38,7 +38,7 @@
#define arch_raw_cpu_ptr(ptr) \
({ \
unsigned long tcp_ptr__; \
- asm volatile("add " __percpu_arg(1) ", %0" \
+ asm ("add " __percpu_arg(1) ", %0" \
: "=r" (tcp_ptr__) \
: "m" (this_cpu_off), "0" (ptr)); \
(typeof(*(ptr)) __kernel __force *)tcp_ptr__; \
--
2.35.1
Powered by blists - more mailing lists