Date:   Sun, 26 Jul 2020 22:55:15 -0700
From:   hpa@...or.com
To:     Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>, Borislav Petkov <bp@...e.de>,
        Andy Lutomirski <luto@...nel.org>, x86@...nel.org
CC:     Dave Hansen <dave.hansen@...el.com>,
        Tony Luck <tony.luck@...el.com>,
        Cathy Zhang <cathy.zhang@...el.com>,
        Fenghua Yu <fenghua.yu@...el.com>,
        Kyung Min Park <kyung.min.park@...el.com>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        "Ravi V. Shankar" <ravi.v.shankar@...el.com>,
        Sean Christopherson <sean.j.christopherson@...el.com>,
        linux-kernel@...r.kernel.org,
        Ricardo Neri <ricardo.neri@...el.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        linux-edac@...r.kernel.org
Subject: Re: [PATCH 4/4] x86/cpu: Use SERIALIZE in sync_core() when available

On July 26, 2020 9:31:32 PM PDT, Ricardo Neri <ricardo.neri-calderon@...ux.intel.com> wrote:
>The SERIALIZE instruction gives software a way to force the processor
>to complete all modifications to flags, registers and memory from
>previous instructions and drain all buffered writes to memory before
>the next instruction is fetched and executed. Thus, it serves the
>purpose of sync_core(). Use it when available.
>
>Use boot_cpu_has() and not static_cpu_has(); the most critical paths
>(returning to user mode and from interrupt and NMI) will not reach
>sync_core().
>
>Cc: Andy Lutomirski <luto@...nel.org>
>Cc: Cathy Zhang <cathy.zhang@...el.com>
>Cc: Dave Hansen <dave.hansen@...ux.intel.com>
>Cc: Fenghua Yu <fenghua.yu@...el.com>
>Cc: "H. Peter Anvin" <hpa@...or.com>
>Cc: Kyung Min Park <kyung.min.park@...el.com>
>Cc: Peter Zijlstra <peterz@...radead.org>
>Cc: "Ravi V. Shankar" <ravi.v.shankar@...el.com>
>Cc: Sean Christopherson <sean.j.christopherson@...el.com>
>Cc: linux-edac@...r.kernel.org
>Cc: linux-kernel@...r.kernel.org
>Reviewed-by: Tony Luck <tony.luck@...el.com>
>Suggested-by: Andy Lutomirski <luto@...nel.org>
>Signed-off-by: Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
>---
> arch/x86/include/asm/special_insns.h |  5 +++++
> arch/x86/include/asm/sync_core.h     | 10 +++++++++-
> 2 files changed, 14 insertions(+), 1 deletion(-)
>
>diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
>index 59a3e13204c3..0a2a60bba282 100644
>--- a/arch/x86/include/asm/special_insns.h
>+++ b/arch/x86/include/asm/special_insns.h
>@@ -234,6 +234,11 @@ static inline void clwb(volatile void *__p)
> 
> #define nop() asm volatile ("nop")
> 
>+static inline void serialize(void)
>+{
>+	asm volatile(".byte 0xf, 0x1, 0xe8");
>+}
>+
> #endif /* __KERNEL__ */
> 
> #endif /* _ASM_X86_SPECIAL_INSNS_H */
>diff --git a/arch/x86/include/asm/sync_core.h b/arch/x86/include/asm/sync_core.h
>index fdb5b356e59b..bf132c09d61b 100644
>--- a/arch/x86/include/asm/sync_core.h
>+++ b/arch/x86/include/asm/sync_core.h
>@@ -5,6 +5,7 @@
> #include <linux/preempt.h>
> #include <asm/processor.h>
> #include <asm/cpufeature.h>
>+#include <asm/special_insns.h>
> 
> #ifdef CONFIG_X86_32
> static inline void iret_to_self(void)
>@@ -54,7 +55,8 @@ static inline void iret_to_self(void)
> static inline void sync_core(void)
> {
> 	/*
>-	 * There are quite a few ways to do this.  IRET-to-self is nice
>+	 * Hardware can do this for us if SERIALIZE is available. Otherwise,
>+	 * there are quite a few ways to do this.  IRET-to-self is nice
> 	 * because it works on every CPU, at any CPL (so it's compatible
> 	 * with paravirtualization), and it never exits to a hypervisor.
> 	 * The only down sides are that it's a bit slow (it seems to be
>@@ -75,6 +77,12 @@ static inline void sync_core(void)
> 	 * Like all of Linux's memory ordering operations, this is a
> 	 * compiler barrier as well.
> 	 */
>+
>+	if (boot_cpu_has(X86_FEATURE_SERIALIZE)) {
>+		serialize();
>+		return;
>+	}
>+
> 	iret_to_self();
> }
> 

Any reason to not make sync_core() an inline with alternatives?
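
Roughly something like this (just a sketch, not the posted patch --
static_cpu_has() is patched via the alternatives machinery, so the feature
test becomes a patched jump rather than a runtime flag load):

static inline void sync_core(void)
{
	/* static_cpu_has() compiles to a jump patched by alternatives at boot. */
	if (static_cpu_has(X86_FEATURE_SERIALIZE)) {
		serialize();
		return;
	}
	iret_to_self();
}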

For a really overengineered solution, which might perform unnecessarily poorly on existing hardware:

asm volatile("1: .byte 0xf, 0x1, 0xe8; 2:"
                        _ASM_EXTABLE(1b,2b));
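
(The idea being that on CPUs without SERIALIZE the opcode just raises #UD
and the extable fixup resumes at 2:, so old hardware pays for a trap on
every sync_core() -- hence the performance caveat above.)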

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
