lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 05 Apr 2022 08:28:56 -0000
From:   "tip-bot2 for Sebastian Andrzej Siewior" <tip-bot2@...utronix.de>
To:     linux-tip-commits@...r.kernel.org
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>, x86@...nel.org,
        linux-kernel@...r.kernel.org
Subject: [tip: locking/urgent] x86/percpu: Remove volatile from arch_raw_cpu_ptr().

The following commit has been merged into the locking/urgent branch of tip:

Commit-ID:     1c1e7e3c23dd25f938302428eeb22c3dda2c3427
Gitweb:        https://git.kernel.org/tip/1c1e7e3c23dd25f938302428eeb22c3dda2c3427
Author:        Sebastian Andrzej Siewior <bigeasy@...utronix.de>
AuthorDate:    Mon, 28 Mar 2022 16:58:08 +02:00
Committer:     Peter Zijlstra <peterz@...radead.org>
CommitterDate: Tue, 05 Apr 2022 09:59:38 +02:00

x86/percpu: Remove volatile from arch_raw_cpu_ptr().

The volatile attribute in the inline assembly of arch_raw_cpu_ptr()
forces the compiler to always generate the code, even if the compiler
can decide upfront that its result is not needed.

For instance invoking __intel_pmu_disable_all(false) (like
intel_pmu_snapshot_arch_branch_stack() does) leads to loading the
address of &cpu_hw_events into the register while compiler knows that it
has no need for it. This ends up with code like:

|	movq	$cpu_hw_events, %rax			#, tcp_ptr__
|	add	%gs:this_cpu_off(%rip), %rax		# this_cpu_off, tcp_ptr__
|	xorl	%eax, %eax				# tmp93

It also creates additional code within local_lock() with !RT &&
!LOCKDEP which is not desired.

By removing the volatile attribute the compiler can place the
function freely and avoid it if it is not needed in the end.
By using the function twice the compiler properly caches only the
variable offset and always loads the CPU-offset.

this_cpu_ptr() also remains properly placed within a preempt_disable()
sections because
- arch_raw_cpu_ptr() assembly has a memory input ("m" (this_cpu_off))
- prempt_{dis,en}able() fundamentally has a 'barrier()' in it

Therefore this_cpu_ptr() is already properly serialized and does not
rely on the 'volatile' attribute.

Remove volatile from arch_raw_cpu_ptr().

[ bigeasy: Added Linus' explanation why this_cpu_ptr() is not moved out
  of a preempt_disable() section without the 'volatile' attribute. ]

Suggested-by: Linus Torvalds <torvalds@...ux-foundation.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
Link: https://lore.kernel.org/r/20220328145810.86783-2-bigeasy@linutronix.de
---
 arch/x86/include/asm/percpu.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index a3c33b7..13c0d63 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -38,9 +38,9 @@
 #define arch_raw_cpu_ptr(ptr)				\
 ({							\
 	unsigned long tcp_ptr__;			\
-	asm volatile("add " __percpu_arg(1) ", %0"	\
-		     : "=r" (tcp_ptr__)			\
-		     : "m" (this_cpu_off), "0" (ptr));	\
+	asm ("add " __percpu_arg(1) ", %0"		\
+	     : "=r" (tcp_ptr__)				\
+	     : "m" (this_cpu_off), "0" (ptr));		\
 	(typeof(*(ptr)) __kernel __force *)tcp_ptr__;	\
 })
 #else

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ