Message-Id: <20230305205628.27385-1-ubizjak@gmail.com>
Date: Sun, 5 Mar 2023 21:56:18 +0100
From: Uros Bizjak <ubizjak@...il.com>
To: linux-alpha@...r.kernel.org, linux-kernel@...r.kernel.org,
loongarch@...ts.linux.dev, linux-mips@...r.kernel.org,
linuxppc-dev@...ts.ozlabs.org, linux-arch@...r.kernel.org,
linux-perf-users@...r.kernel.org
Cc: Uros Bizjak <ubizjak@...il.com>,
Richard Henderson <richard.henderson@...aro.org>,
Ivan Kokshaysky <ink@...assic.park.msu.ru>,
Matt Turner <mattst88@...il.com>,
Huacai Chen <chenhuacai@...nel.org>,
WANG Xuerui <kernel@...0n.name>,
Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
Michael Ellerman <mpe@...erman.id.au>,
Nicholas Piggin <npiggin@...il.com>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>, Arnd Bergmann <arnd@...db.de>,
Peter Zijlstra <peterz@...radead.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>, Will Deacon <will@...nel.org>,
Boqun Feng <boqun.feng@...il.com>,
Jiaxun Yang <jiaxun.yang@...goat.com>,
Jun Yi <yijun@...ngson.cn>
Subject: [PATCH 00/10] locking: Introduce local{,64}_try_cmpxchg
Add generic and target-specific support for local{,64}_try_cmpxchg
and wire it up for all targets that use the local_t infrastructure.
The series enables x86 targets to emit a special instruction for
local_try_cmpxchg, and also for local64_try_cmpxchg on x86_64.
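For readers unfamiliar with the try_ variants, the following is a minimal
userspace sketch of how a try_cmpxchg-style operation can be layered on top
of a value-returning cmpxchg. It is not the code from this series: the names
cmpxchg_value() and try_cmpxchg_fallback() are hypothetical, and the GCC/Clang
__atomic builtin stands in for the kernel's per-architecture primitives (which,
for local_t, only need to be atomic with respect to the local CPU).

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for a value-returning cmpxchg: stores new_val to
 * *ptr if *ptr == old, and returns the value that was found in *ptr.
 * Assumption: the compiler provides the GCC/Clang __atomic builtins.
 */
long cmpxchg_value(long *ptr, long old, long new_val)
{
	/* On failure the builtin writes the current value into 'old'. */
	__atomic_compare_exchange_n(ptr, &old, new_val, false,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return old;
}

/*
 * Sketch of a generic try_cmpxchg-style fallback: returns true on success;
 * on failure, updates *oldp with the value actually found so that the
 * caller can retry without reloading it.
 */
bool try_cmpxchg_fallback(long *ptr, long *oldp, long new_val)
{
	long old = *oldp;
	long seen = cmpxchg_value(ptr, old, new_val);

	if (seen != old)
		*oldp = seen;
	return seen == old;
}
```

On x86, CMPXCHG already reports success in ZF and leaves the observed value
in the accumulator, so a native try_ form can return the flag directly
instead of comparing the result against the old value again.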
The last patch changes __perf_output_begin in events/ring_buffer
to use the new locking primitive and improves the generated code from
4b3: 48 8b 82 e8 00 00 00 mov 0xe8(%rdx),%rax
4ba: 48 8b b8 08 04 00 00 mov 0x408(%rax),%rdi
4c1: 8b 42 1c mov 0x1c(%rdx),%eax
4c4: 48 8b 4a 28 mov 0x28(%rdx),%rcx
4c8: 85 c0 test %eax,%eax
...
4ef: 48 89 c8 mov %rcx,%rax
4f2: 48 0f b1 7a 28 cmpxchg %rdi,0x28(%rdx)
4f7: 48 39 c1 cmp %rax,%rcx
4fa: 75 b7 jne 4b3 <...>
to
4b2: 48 8b 4a 28 mov 0x28(%rdx),%rcx
4b6: 48 8b 82 e8 00 00 00 mov 0xe8(%rdx),%rax
4bd: 48 8b b0 08 04 00 00 mov 0x408(%rax),%rsi
4c4: 8b 42 1c mov 0x1c(%rdx),%eax
4c7: 85 c0 test %eax,%eax
...
4d4: 48 89 c8 mov %rcx,%rax
4d7: 48 0f b1 72 28 cmpxchg %rsi,0x28(%rdx)
4dc: 0f 85 d0 00 00 00 jne 5b2 <...>
...
5b2: 48 89 c1 mov %rax,%rcx
5b5: e9 fc fe ff ff jmp 4b6 <...>
Please note that, in addition to the removed compare, the load from
0x28(%rdx) is moved out of the loop and the code is rearranged
according to the likely/unlikely tags in the source.
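The difference between the two loop shapes can be sketched in userspace C as
follows. This is a simplified illustration, not the __perf_output_begin code:
reserve_old(), reserve_new() and the helper names are hypothetical, and the
GCC/Clang __atomic builtins are assumed in place of the kernel's local_t
operations.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical value-returning cmpxchg (assumes GCC/Clang __atomic builtins). */
uint64_t sketch_cmpxchg(uint64_t *ptr, uint64_t old, uint64_t new_val)
{
	__atomic_compare_exchange_n(ptr, &old, new_val, false,
				    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
	return old;	/* value found in *ptr */
}

/* Hypothetical try_cmpxchg: returns success, refreshes *old on failure. */
bool sketch_try_cmpxchg(uint64_t *ptr, uint64_t *old, uint64_t new_val)
{
	return __atomic_compare_exchange_n(ptr, old, new_val, false,
					   __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

/* Old shape: reload the head every iteration, compare the returned value. */
uint64_t reserve_old(uint64_t *head, uint64_t size)
{
	uint64_t offset, new_head;

	do {
		offset = __atomic_load_n(head, __ATOMIC_RELAXED);
		new_head = offset + size;
	} while (sketch_cmpxchg(head, offset, new_head) != offset);

	return offset;
}

/* New shape: load once; a failed try_cmpxchg refreshes offset for the retry. */
uint64_t reserve_new(uint64_t *head, uint64_t size)
{
	uint64_t offset = __atomic_load_n(head, __ATOMIC_RELAXED);
	uint64_t new_head;

	do {
		new_head = offset + size;
	} while (!sketch_try_cmpxchg(head, &offset, new_head));

	return offset;
}
```

Both functions return the offset that was reserved and advance the head by
size; only the second lets the compiler hoist the initial load and branch
directly on the result of the compare-and-exchange.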
Cc: Richard Henderson <richard.henderson@...aro.org>
Cc: Ivan Kokshaysky <ink@...assic.park.msu.ru>
Cc: Matt Turner <mattst88@...il.com>
Cc: Huacai Chen <chenhuacai@...nel.org>
Cc: WANG Xuerui <kernel@...0n.name>
Cc: Thomas Bogendoerfer <tsbogend@...ha.franken.de>
Cc: Michael Ellerman <mpe@...erman.id.au>
Cc: Nicholas Piggin <npiggin@...il.com>
Cc: Christophe Leroy <christophe.leroy@...roup.eu>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Borislav Petkov <bp@...en8.de>
Cc: Dave Hansen <dave.hansen@...ux.intel.com>
Cc: x86@...nel.org
Cc: "H. Peter Anvin" <hpa@...or.com>
Cc: Arnd Bergmann <arnd@...db.de>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>
Cc: Mark Rutland <mark.rutland@....com>
Cc: Alexander Shishkin <alexander.shishkin@...ux.intel.com>
Cc: Jiri Olsa <jolsa@...nel.org>
Cc: Namhyung Kim <namhyung@...nel.org>
Cc: Ian Rogers <irogers@...gle.com>
Cc: Will Deacon <will@...nel.org>
Cc: Boqun Feng <boqun.feng@...il.com>
Cc: Jiaxun Yang <jiaxun.yang@...goat.com>
Cc: Jun Yi <yijun@...ngson.cn>
Uros Bizjak (10):
locking/atomic: Add missing cast to try_cmpxchg() fallbacks
locking/atomic: Add generic try_cmpxchg{,64}_local support
locking/alpha: Wire up local_try_cmpxchg
locking/loongarch: Wire up local_try_cmpxchg
locking/mips: Wire up local_try_cmpxchg
locking/powerpc: Wire up local_try_cmpxchg
locking/x86: Wire up local_try_cmpxchg
locking/generic: Wire up local{,64}_try_cmpxchg
locking/x86: Enable local{,64}_try_cmpxchg
perf/ring_buffer: use local_try_cmpxchg in __perf_output_begin
arch/alpha/include/asm/local.h | 2 ++
arch/loongarch/include/asm/local.h | 2 ++
arch/mips/include/asm/local.h | 2 ++
arch/powerpc/include/asm/local.h | 11 ++++++
arch/x86/include/asm/cmpxchg.h | 6 ++++
arch/x86/include/asm/local.h | 2 ++
include/asm-generic/local.h | 1 +
include/asm-generic/local64.h | 2 ++
include/linux/atomic/atomic-arch-fallback.h | 40 ++++++++++++++++-----
include/linux/atomic/atomic-instrumented.h | 20 ++++++++++-
kernel/events/ring_buffer.c | 5 +--
scripts/atomic/gen-atomic-fallback.sh | 6 +++-
scripts/atomic/gen-atomic-instrumented.sh | 2 +-
13 files changed, 87 insertions(+), 14 deletions(-)
--
2.39.2