Date: Wed, 13 Jan 2016 22:12:44 +0200
From: "Michael S. Tsirkin" <mst@...hat.com>
To: linux-kernel@...r.kernel.org,
    Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Davidlohr Bueso <dave@...olabs.net>,
    Peter Zijlstra <peterz@...radead.org>,
    Ingo Molnar <mingo@...nel.org>,
    Thomas Gleixner <tglx@...utronix.de>,
    "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
    the arch/x86 maintainers <x86@...nel.org>,
    Davidlohr Bueso <dbueso@...e.de>,
    "H. Peter Anvin" <hpa@...or.com>,
    virtualization <virtualization@...ts.linux-foundation.org>,
    Borislav Petkov <bp@...en8.de>,
    Andy Lutomirski <luto@...capital.net>,
    Ingo Molnar <mingo@...hat.com>,
    Borislav Petkov <bp@...e.de>,
    Arnd Bergmann <arnd@...db.de>,
    Andrey Konovalov <andreyknvl@...gle.com>,
    Andy Lutomirski <luto@...nel.org>
Subject: [PATCH v3 4/4] x86: drop mfence in favor of lock+addl

mfence appears to be way slower than a locked instruction - let's use
lock+add unconditionally, as we always did on old 32-bit.

Just poking at SP would be the most natural, but if we
then read the value from SP, we get a false dependency
which will slow us down.

This was noted in this article:
http://shipilev.net/blog/2014/on-the-fence-with-dependencies/

And is easy to reproduce by sticking a barrier in a small non-inline
function.

So let's use a negative offset - which avoids this problem since we
build with the red zone disabled.

Update rmb/wmb on 32 bit to use the negative offset, too, for
consistency.

Suggested-by: Andy Lutomirski <luto@...capital.net>
Signed-off-by: Michael S. Tsirkin <mst@...hat.com>
---
 arch/x86/include/asm/barrier.h | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index bfb28ca..9a2d257 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -11,16 +11,15 @@
  */
 
 #ifdef CONFIG_X86_32
-#define mb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "mfence", \
-                                      X86_FEATURE_XMM2) ::: "memory", "cc")
-#define rmb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "lfence", \
+#define mb() asm volatile("lock; addl $0,-4(%%esp)" ::: "memory", "cc")
+#define rmb() asm volatile(ALTERNATIVE("lock; addl $0,-4(%%esp)", "lfence", \
                                        X86_FEATURE_XMM2) ::: "memory", "cc")
-#define wmb() asm volatile(ALTERNATIVE("lock; addl $0,0(%%esp)", "sfence", \
+#define wmb() asm volatile(ALTERNATIVE("lock; addl $0,-4(%%esp)", "sfence", \
                                        X86_FEATURE_XMM2) ::: "memory", "cc")
 #else
-#define mb() asm volatile("mfence":::"memory")
-#define rmb() asm volatile("lfence":::"memory")
-#define wmb() asm volatile("sfence" ::: "memory")
+#define mb() asm volatile("lock; addl $0,-4(%%rsp)" ::: "memory", "cc")
+#define rmb() asm volatile("lfence" ::: "memory")
+#define wmb() asm volatile("sfence" ::: "memory")
 #endif
 
 #ifdef CONFIG_X86_PPRO_FENCE
-- 
MST
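
For readers who want to poke at the two barrier forms outside the kernel, here is a minimal userspace sketch. It is not part of the patch: the names demo_mb_sp0, demo_mb_neg and demo_publish are hypothetical, and the non-x86 fallback to __sync_synchronize() is an assumption added only so the file builds elsewhere. Userspace is normally built with the red zone enabled, so the word at -4(%rsp) may hold live data there. Adding zero leaves it unchanged, but the false-dependency argument from the commit message may not carry over.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-ins for the kernel's mb().
 * demo_mb_sp0: locked add of zero to the word at the stack pointer
 *              (the form the old 32-bit code used).
 * demo_mb_neg: the same locked add, one word below the stack pointer
 *              (the form this patch switches to).
 */
#if defined(__x86_64__)
#define demo_mb_sp0() \
	__asm__ __volatile__("lock; addl $0,0(%%rsp)" ::: "memory", "cc")
#define demo_mb_neg() \
	__asm__ __volatile__("lock; addl $0,-4(%%rsp)" ::: "memory", "cc")
#elif defined(__i386__)
#define demo_mb_sp0() \
	__asm__ __volatile__("lock; addl $0,0(%%esp)" ::: "memory", "cc")
#define demo_mb_neg() \
	__asm__ __volatile__("lock; addl $0,-4(%%esp)" ::: "memory", "cc")
#else
/* Assumption: on other architectures fall back to the compiler builtin. */
#define demo_mb_sp0() __sync_synchronize()
#define demo_mb_neg() __sync_synchronize()
#endif

volatile int32_t demo_data;
volatile int32_t demo_flag;

/*
 * Store data, issue a full barrier, then set a flag and read both back.
 * Run single-threaded, this only shows that both forms assemble and that
 * adding zero leaves program state intact; it does not prove ordering.
 */
int32_t demo_publish(int32_t value)
{
	demo_data = value;
	demo_mb_neg();		/* negative-offset form from the patch */
	demo_flag = 1;
	demo_mb_sp0();		/* offset-0 form, for comparison */
	return demo_flag ? demo_data : -1;
}
```

Calling demo_publish(42) should hand back 42. To look for the slowdown described above, one could time a small non-inline function containing each macro in a tight loop. No numbers are claimed here.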