Date:   Tue,  4 Aug 2020 19:10:59 -0700
From:   Ricardo Neri <>
To:     Thomas Gleixner <>,
        Ingo Molnar <>, Borislav Petkov <>,
        Andy Lutomirski <>,
        "Peter Zijlstra (Intel)" <>
Cc:     Dave Hansen <>,
        Tony Luck <>,
        Cathy Zhang <>,
        Fenghua Yu <>,
        "H. Peter Anvin" <>,
        Kyung Min Park <>,
        "Ravi V. Shankar" <>,
        Sean Christopherson <>,
        Ricardo Neri <>
Subject: [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available

The SERIALIZE instruction gives software a way to force the processor to
complete all modifications to flags, registers and memory from previous
instructions and drain all buffered writes to memory before the next
instruction is fetched and executed. Thus, it serves the purpose of
sync_core(). Use it when available.

Commit 7117f16bf460 ("objtool: Fix ORC vs alternatives") enforced stack
invariance in alternatives. The iret-to-self does not comply with such
invariance. Thus, it cannot be used inside alternative code. Instead, use
an alternative that jumps to SERIALIZE when available.

Cc: Andy Lutomirski <>
Cc: Cathy Zhang <>
Cc: Dave Hansen <>
Cc: Fenghua Yu <>
Cc: "H. Peter Anvin" <>
Cc: Kyung Min Park <>
Cc: Peter Zijlstra <>
Cc: "Ravi V. Shankar" <>
Cc: Sean Christopherson <>
Suggested-by: Andy Lutomirski <>
Signed-off-by: Ricardo Neri <>
---
This is a v2 from my initial submission [1]. The first three patches of
the series have been merged in Linus' tree. Hence, I am submitting only
this patch for review.


Changes since v1:
 * Support SERIALIZE using alternative runtime patching.
   (Peter Zijlstra, H. Peter Anvin)
 * Added a note to specify which version of binutils supports SERIALIZE.
   (Peter Zijlstra)
 * Verified that (::: "memory") is used. (H. Peter Anvin)
---
 arch/x86/include/asm/special_insns.h |  2 ++
 arch/x86/include/asm/sync_core.h     | 10 +++++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 59a3e13204c3..25cd67801dda 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -10,6 +10,8 @@
 #include <linux/irqflags.h>
 #include <linux/jump_label.h>
+/* Instruction opcode for SERIALIZE; supported in binutils >= 2.35. */
+#define __ASM_SERIALIZE ".byte 0xf, 0x1, 0xe8"
  * Volatile isn't enough to prevent the compiler from reordering the
  * read/write functions for the control registers and messing everything up.
diff --git a/arch/x86/include/asm/sync_core.h b/arch/x86/include/asm/sync_core.h
index fdb5b356e59b..201ea3d9a6bd 100644
--- a/arch/x86/include/asm/sync_core.h
+++ b/arch/x86/include/asm/sync_core.h
@@ -5,15 +5,19 @@
 #include <linux/preempt.h>
 #include <asm/processor.h>
 #include <asm/cpufeature.h>
+#include <asm/special_insns.h>
 #ifdef CONFIG_X86_32
 static inline void iret_to_self(void)
 	asm volatile (
 		"pushl %%cs\n\t"
 		"pushl $1f\n\t"
+		"2:\n\t"
 		: ASM_CALL_CONSTRAINT : : "memory");
@@ -23,6 +27,7 @@ static inline void iret_to_self(void)
 	unsigned int tmp;
 	asm volatile (
 		"mov %%ss, %0\n\t"
 		"pushq %q0\n\t"
 		"pushq %%rsp\n\t"
@@ -32,6 +37,8 @@ static inline void iret_to_self(void)
 		"pushq %q0\n\t"
 		"pushq $1f\n\t"
+		"2:\n\t"
 		: "=&r" (tmp), ASM_CALL_CONSTRAINT : : "cc", "memory");
@@ -54,7 +61,8 @@ static inline void iret_to_self(void)
 static inline void sync_core(void)
-	 * There are quite a few ways to do this.  IRET-to-self is nice
+	 * Hardware can do this for us if SERIALIZE is available. Otherwise,
+	 * there are quite a few ways to do this.  IRET-to-self is nice
 	 * because it works on every CPU, at any CPL (so it's compatible
 	 * with paravirtualization), and it never exits to a hypervisor.
 	 * The only down sides are that it's a bit slow (it seems to be
