[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <163156195512.25758.5915554914005716758.tip-bot2@tip-bot2>
Date: Mon, 13 Sep 2021 19:39:15 -0000
From: "tip-bot2 for H. Peter Anvin" <tip-bot2@...utronix.de>
To: linux-tip-commits@...r.kernel.org
Cc: "H. Peter Anvin (Intel)" <hpa@...or.com>,
Borislav Petkov <bp@...e.de>, x86@...nel.org,
linux-kernel@...r.kernel.org
Subject: [tip: x86/cpu] x86/asm: Avoid adding register pressure for the init
case in static_cpu_has()
The following commit has been merged into the x86/cpu branch of tip:
Commit-ID: 0507503671f9b1c867e889cbec0f43abf904f23c
Gitweb: https://git.kernel.org/tip/0507503671f9b1c867e889cbec0f43abf904f23c
Author: H. Peter Anvin <hpa@...or.com>
AuthorDate: Fri, 10 Sep 2021 12:59:10 -07:00
Committer: Borislav Petkov <bp@...e.de>
CommitterDate: Mon, 13 Sep 2021 19:48:21 +02:00
x86/asm: Avoid adding register pressure for the init case in static_cpu_has()
gcc will sometimes manifest the address of boot_cpu_data in a register
as part of constant propagation. When multiple static_cpu_has() are used
this may foul the mainline code with a register load which will only be
used on the fallback path, which is unused after initialization.
Explicitly force gcc to use immediate (rip-relative) addressing for
the fallback path, thus removing any possible register use from
static_cpu_has().
While making changes, modernize the code to use
.pushsection...popsection instead of .section...previous.
Signed-off-by: H. Peter Anvin (Intel) <hpa@...or.com>
Signed-off-by: Borislav Petkov <bp@...e.de>
Link: https://lkml.kernel.org/r/20210910195910.2542662-4-hpa@zytor.com
---
arch/x86/include/asm/cpufeature.h | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 16a51e7..1261842 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -173,20 +173,25 @@ extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
* means that the boot_cpu_has() variant is already fast enough for the
* majority of cases and you should stick to using it as it is generally
* only two instructions: a RIP-relative MOV and a TEST.
+ *
+ * Do not use an "m" constraint for [cap_byte] here: gcc doesn't know
+ * that this is only used on a fallback path and will sometimes cause
+ * it to manifest the address of boot_cpu_data in a register, fouling
+ * the mainline (post-initialization) code.
*/
static __always_inline bool _static_cpu_has(u16 bit)
{
asm_volatile_goto(
ALTERNATIVE_TERNARY("jmp 6f", %P[feature], "", "jmp %l[t_no]")
- ".section .altinstr_aux,\"ax\"\n"
+ ".pushsection .altinstr_aux,\"ax\"\n"
"6:\n"
- " testb %[bitnum],%[cap_byte]\n"
+ " testb %[bitnum]," _ASM_RIP(%P[cap_byte]) "\n"
" jnz %l[t_yes]\n"
" jmp %l[t_no]\n"
- ".previous\n"
+ ".popsection\n"
: : [feature] "i" (bit),
[bitnum] "i" (1 << (bit & 7)),
- [cap_byte] "m" (((const char *)boot_cpu_data.x86_capability)[bit >> 3])
+ [cap_byte] "i" (&((const char *)boot_cpu_data.x86_capability)[bit >> 3])
: : t_yes, t_no);
t_yes:
return true;
Powered by blists - more mailing lists