lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 16 Oct 2023 11:09:12 -0000
From:   "tip-bot2 for Uros Bizjak" <tip-bot2@...utronix.de>
To:     linux-tip-commits@...r.kernel.org
Cc:     Uros Bizjak <ubizjak@...il.com>, Ingo Molnar <mingo@...nel.org>,
        Andy Lutomirski <luto@...nel.org>,
        Brian Gerst <brgerst@...il.com>,
        Denys Vlasenko <dvlasenk@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Sean Christopherson <seanjc@...gle.com>, x86@...nel.org,
        linux-kernel@...r.kernel.org
Subject: [tip: x86/percpu] x86/percpu: Rewrite arch_raw_cpu_ptr() to be easier
 for compilers to optimize

The following commit has been merged into the x86/percpu branch of tip:

Commit-ID:     a048d3abae7c33f0a3f4575fab15ac5504d443f7
Gitweb:        https://git.kernel.org/tip/a048d3abae7c33f0a3f4575fab15ac5504d443f7
Author:        Uros Bizjak <ubizjak@...il.com>
AuthorDate:    Sun, 15 Oct 2023 22:24:39 +02:00
Committer:     Ingo Molnar <mingo@...nel.org>
CommitterDate: Mon, 16 Oct 2023 12:51:58 +02:00

x86/percpu: Rewrite arch_raw_cpu_ptr() to be easier for compilers to optimize

Implement arch_raw_cpu_ptr() as a load from this_cpu_off and then
add the ptr value to the base. This way, the compiler can propagate
addend to the following instruction and simplify address calculation.

E.g.: address calcuation in amd_pmu_enable_virt() improves from:

    48 c7 c0 00 00 00 00	mov    $0x0,%rax
	87b7: R_X86_64_32S	cpu_hw_events

    65 48 03 05 00 00 00	add    %gs:0x0(%rip),%rax
    00
	87bf: R_X86_64_PC32	this_cpu_off-0x4

    48 c7 80 28 13 00 00	movq   $0x0,0x1328(%rax)
    00 00 00 00

to:

    65 48 8b 05 00 00 00	mov    %gs:0x0(%rip),%rax
    00
	8798: R_X86_64_PC32	this_cpu_off-0x4
    48 c7 80 00 00 00 00	movq   $0x0,0x0(%rax)
    00 00 00 00
	87a6: R_X86_64_32S	cpu_hw_events+0x1328

The compiler also eliminates additional redundant loads from
this_cpu_off, reducing the number of percpu offset reads
from 1668 to 1646 on a test build, a -1.3% reduction.

Signed-off-by: Uros Bizjak <ubizjak@...il.com>
Signed-off-by: Ingo Molnar <mingo@...nel.org>
Cc: Andy Lutomirski <luto@...nel.org>
Cc: Brian Gerst <brgerst@...il.com>
Cc: Denys Vlasenko <dvlasenk@...hat.com>
Cc: H. Peter Anvin <hpa@...or.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Josh Poimboeuf <jpoimboe@...hat.com>
Cc: Uros Bizjak <ubizjak@...il.com>
Cc: Sean Christopherson <seanjc@...gle.com>
Link: https://lore.kernel.org/r/20231015202523.189168-1-ubizjak@gmail.com
---
 arch/x86/include/asm/percpu.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 60ea775..915675f 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -56,9 +56,11 @@
 #define arch_raw_cpu_ptr(ptr)					\
 ({								\
 	unsigned long tcp_ptr__;				\
-	asm ("add " __percpu_arg(1) ", %0"			\
+	asm ("mov " __percpu_arg(1) ", %0"			\
 	     : "=r" (tcp_ptr__)					\
-	     : "m" (__my_cpu_var(this_cpu_off)), "0" (ptr));	\
+	     : "m" (__my_cpu_var(this_cpu_off)));		\
+								\
+	tcp_ptr__ += (unsigned long)(ptr);			\
 	(typeof(*(ptr)) __kernel __force *)tcp_ptr__;		\
 })
 #else /* CONFIG_SMP */

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ