Message-ID: <EBDB5889-4FAC-45FC-A2B1-285751721592@zytor.com>
Date: Sun, 26 Jul 2020 23:00:33 -0700
From: hpa@...or.com
To: Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...nel.org>, Borislav Petkov <bp@...e.de>,
Andy Lutomirski <luto@...nel.org>, x86@...nel.org
CC: Dave Hansen <dave.hansen@...el.com>,
Tony Luck <tony.luck@...el.com>,
Cathy Zhang <cathy.zhang@...el.com>,
Fenghua Yu <fenghua.yu@...el.com>,
Kyung Min Park <kyung.min.park@...el.com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
"Ravi V. Shankar" <ravi.v.shankar@...el.com>,
Sean Christopherson <sean.j.christopherson@...el.com>,
linux-kernel@...r.kernel.org,
Ricardo Neri <ricardo.neri@...el.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
linux-edac@...r.kernel.org
Subject: Re: [PATCH 4/4] x86/cpu: Use SERIALIZE in sync_core() when available
On July 26, 2020 10:55:15 PM PDT, hpa@...or.com wrote:
>On July 26, 2020 9:31:32 PM PDT, Ricardo Neri
><ricardo.neri-calderon@...ux.intel.com> wrote:
>>The SERIALIZE instruction gives software a way to force the processor to
>>complete all modifications to flags, registers and memory from previous
>>instructions and drain all buffered writes to memory before the next
>>instruction is fetched and executed. Thus, it serves the purpose of
>>sync_core(). Use it when available.
>>
>>Use boot_cpu_has() and not static_cpu_has(); the most critical paths
>>(returning to user mode and from interrupt and NMI) will not reach
>>sync_core().
>>
>>Cc: Andy Lutomirski <luto@...nel.org>
>>Cc: Cathy Zhang <cathy.zhang@...el.com>
>>Cc: Dave Hansen <dave.hansen@...ux.intel.com>
>>Cc: Fenghua Yu <fenghua.yu@...el.com>
>>Cc: "H. Peter Anvin" <hpa@...or.com>
>>Cc: Kyung Min Park <kyung.min.park@...el.com>
>>Cc: Peter Zijlstra <peterz@...radead.org>
>>Cc: "Ravi V. Shankar" <ravi.v.shankar@...el.com>
>>Cc: Sean Christopherson <sean.j.christopherson@...el.com>
>>Cc: linux-edac@...r.kernel.org
>>Cc: linux-kernel@...r.kernel.org
>>Reviewed-by: Tony Luck <tony.luck@...el.com>
>>Suggested-by: Andy Lutomirski <luto@...nel.org>
>>Signed-off-by: Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
>>---
>> arch/x86/include/asm/special_insns.h | 5 +++++
>> arch/x86/include/asm/sync_core.h | 10 +++++++++-
>> 2 files changed, 14 insertions(+), 1 deletion(-)
>>
>>diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
>>index 59a3e13204c3..0a2a60bba282 100644
>>--- a/arch/x86/include/asm/special_insns.h
>>+++ b/arch/x86/include/asm/special_insns.h
>>@@ -234,6 +234,11 @@ static inline void clwb(volatile void *__p)
>>
>> #define nop() asm volatile ("nop")
>>
>>+static inline void serialize(void)
>>+{
>>+	asm volatile(".byte 0xf, 0x1, 0xe8");
>>+}
>>+
>> #endif /* __KERNEL__ */
>>
>> #endif /* _ASM_X86_SPECIAL_INSNS_H */
>>diff --git a/arch/x86/include/asm/sync_core.h b/arch/x86/include/asm/sync_core.h
>>index fdb5b356e59b..bf132c09d61b 100644
>>--- a/arch/x86/include/asm/sync_core.h
>>+++ b/arch/x86/include/asm/sync_core.h
>>@@ -5,6 +5,7 @@
>> #include <linux/preempt.h>
>> #include <asm/processor.h>
>> #include <asm/cpufeature.h>
>>+#include <asm/special_insns.h>
>>
>> #ifdef CONFIG_X86_32
>> static inline void iret_to_self(void)
>>@@ -54,7 +55,8 @@ static inline void iret_to_self(void)
>> static inline void sync_core(void)
>> {
>> 	/*
>>-	 * There are quite a few ways to do this. IRET-to-self is nice
>>+	 * Hardware can do this for us if SERIALIZE is available. Otherwise,
>>+	 * there are quite a few ways to do this. IRET-to-self is nice
>> 	 * because it works on every CPU, at any CPL (so it's compatible
>> 	 * with paravirtualization), and it never exits to a hypervisor.
>> 	 * The only down sides are that it's a bit slow (it seems to be
>>@@ -75,6 +77,12 @@ static inline void sync_core(void)
>> 	 * Like all of Linux's memory ordering operations, this is a
>> 	 * compiler barrier as well.
>> 	 */
>>+
>>+	if (boot_cpu_has(X86_FEATURE_SERIALIZE)) {
>>+		serialize();
>>+		return;
>>+	}
>>+
>> 	iret_to_self();
>> }
>>
>
>Any reason to not make sync_core() an inline with alternatives?
>
>For a really overengineered solution, but one which might perform
>unnecessarily poorly on existing hardware:
>
>asm volatile("1: .byte 0xf, 0x1, 0xe8; 2:"
> _ASM_EXTABLE(1b,2b));
(and : : : "memory" of course.)
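
For readers who want to experiment outside the kernel, the idea in the
patch (probe for SERIALIZE once, use it if present, otherwise fall back
to another serializing operation) can be sketched in user space. This is
only an illustration under stated assumptions, not the kernel code: the
names cpu_has_serialize(), serialize_insn(), cpuid_serialize() and
user_sync_core() are hypothetical, the feature bit is read directly from
CPUID.(EAX=7,ECX=0):EDX[14] instead of via boot_cpu_has(), and the
fallback is CPUID rather than IRET-to-self (CPUID may exit to a
hypervisor, which is one reason the kernel avoids it). On non-x86 builds
the sketch degrades to a plain compiler barrier.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

/* CPUID.(EAX=7,ECX=0):EDX bit 14 enumerates SERIALIZE. */
static bool cpu_has_serialize(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return false;
	return (edx >> 14) & 1;
}

/* Raw opcode bytes, as in the patch, so older assemblers still build it.
 * The "memory" clobber makes it a compiler barrier as well. */
static void serialize_insn(void)
{
	__asm__ __volatile__(".byte 0x0f, 0x01, 0xe8" : : : "memory");
}

/* Fallback for this sketch only: CPUID is architecturally serializing. */
static void cpuid_serialize(void)
{
	unsigned int eax = 0, ebx, ecx = 0, edx;

	__asm__ __volatile__("cpuid"
			     : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
			     :
			     : "memory");
}
#else
/* Non-x86 build: nothing to probe, keep only the compiler barrier. */
static bool cpu_has_serialize(void) { return false; }
static void serialize_insn(void) { }
static void cpuid_serialize(void) { __asm__ __volatile__("" : : : "memory"); }
#endif

/* Hypothetical user-space analogue of sync_core().
 * Returns true if SERIALIZE was used, false if the fallback ran. */
static bool user_sync_core(void)
{
	static int has_serialize = -1;	/* -1: not probed yet */

	if (has_serialize < 0)
		has_serialize = cpu_has_serialize();

	if (has_serialize) {
		serialize_insn();
		return true;
	}

	cpuid_serialize();
	return false;
}
```

The cached flag still costs a load and a branch on every call. The
alternatives approach asked about above would avoid that by patching the
instruction in at boot, which has no direct user-space equivalent here.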
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.