Message-Id: <20220515184205.103089-1-ubizjak@gmail.com>
Date: Sun, 15 May 2022 20:42:02 +0200
From: Uros Bizjak <ubizjak@...il.com>
To: x86@...nel.org, linux-kernel@...r.kernel.org
Cc: Uros Bizjak <ubizjak@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"H. Peter Anvin" <hpa@...or.com>, Will Deacon <will@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Boqun Feng <boqun.feng@...il.com>,
Mark Rutland <mark.rutland@....com>,
"Paul E. McKenney" <paulmck@...nel.org>,
Marco Elver <elver@...gle.com>
Subject: [PATCH v3 0/2] locking/atomic/x86: Introduce arch_try_cmpxchg64
The following patchset adds generic support for try_cmpxchg64
and introduces arch_try_cmpxchg64 for 64-bit and 32-bit targets.
On 64-bit targets, the generated assembly improves from:
  ab:   89 c8                   mov    %ecx,%eax
  ad:   48 89 4c 24 60          mov    %rcx,0x60(%rsp)
  b2:   83 e0 fd                and    $0xfffffffd,%eax
  b5:   89 54 24 64             mov    %edx,0x64(%rsp)
  b9:   88 44 24 60             mov    %al,0x60(%rsp)
  bd:   48 89 c8                mov    %rcx,%rax
  c0:   c6 44 24 62 f2          movb   $0xf2,0x62(%rsp)
  c5:   48 8b 74 24 60          mov    0x60(%rsp),%rsi
  ca:   f0 49 0f b1 34 24       lock cmpxchg %rsi,(%r12)
  d0:   48 39 c1                cmp    %rax,%rcx
  d3:   75 cf                   jne    a4 <t+0xa4>
to:
  b3:   89 c2                   mov    %eax,%edx
  b5:   48 89 44 24 60          mov    %rax,0x60(%rsp)
  ba:   83 e2 fd                and    $0xfffffffd,%edx
  bd:   89 4c 24 64             mov    %ecx,0x64(%rsp)
  c1:   88 54 24 60             mov    %dl,0x60(%rsp)
  c5:   c6 44 24 62 f2          movb   $0xf2,0x62(%rsp)
  ca:   48 8b 54 24 60          mov    0x60(%rsp),%rdx
  cf:   f0 48 0f b1 13          lock cmpxchg %rdx,(%rbx)
  d4:   75 d5                   jne    ab <t+0xab>
where a move and a compare after cmpxchg are saved. The improvements
for 32-bit targets are even more noticeable, because the dual-word
compare after cmpxchg8b is eliminated.
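For readers unfamiliar with the two loop shapes, here is a rough
userspace sketch of the difference. It is not the kernel code: the
helper names are hypothetical, and the GCC/Clang builtins
__sync_val_compare_and_swap and __atomic_compare_exchange_n stand in
for cmpxchg64 and try_cmpxchg64 respectively.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * cmpxchg-style loop (hypothetical helper): the primitive returns the
 * value it found, so the caller has to compare it against the expected
 * value and copy it back before retrying.
 */
static uint64_t set_flag_cmpxchg(uint64_t *ptr, uint64_t flag)
{
	uint64_t old = __atomic_load_n(ptr, __ATOMIC_RELAXED);
	uint64_t prev;

	for (;;) {
		prev = __sync_val_compare_and_swap(ptr, old, old | flag);
		if (prev == old)	/* explicit compare after the exchange */
			return old;
		old = prev;		/* explicit move before the retry */
	}
}

/*
 * try_cmpxchg-style loop (hypothetical helper): the primitive returns
 * success as a boolean and refreshes 'old' on failure, so the compiler
 * can branch directly on the flag set by the instruction.
 */
static uint64_t set_flag_try_cmpxchg(uint64_t *ptr, uint64_t flag)
{
	uint64_t old = __atomic_load_n(ptr, __ATOMIC_RELAXED);

	while (!__atomic_compare_exchange_n(ptr, &old, old | flag, false,
					    __ATOMIC_SEQ_CST, __ATOMIC_RELAXED))
		;	/* 'old' now holds the current value; retry */

	return old;
}
```

Both helpers return the value seen before the update. On x86, cmpxchg
sets ZF on success and leaves the current memory value in the
accumulator on failure, which is what lets the second form drop the
extra mov and cmp.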
Changes since v2:
* Remove invalid and unnecessary cast from 32-bit arch_try_cmpxchg64.
Changes since v1:
* Implement full support for try_cmpxchg64{,_acquire,_release,_relaxed}
and their fallbacks involving cmpxchg64.
* Split the patch into generic and target-dependent parts.
* Modernize __try_cmpxchg64 asm template with symbolic operand name.
* Use generic fallback when arch_try_cmpxchg64 is not defined.
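
As an illustration of the last item, a try_cmpxchg64-style operation can
be built on top of a cmpxchg64-style one roughly as follows. This is a
userspace sketch under stated assumptions, not the generated kernel
macro: the function name is hypothetical and
__sync_val_compare_and_swap stands in for arch_cmpxchg64.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical fallback: emulate try_cmpxchg64 with a primitive that
 * returns the previous value. On failure, the observed value is written
 * back through 'oldp' so the caller can retry without reloading.
 */
static bool try_cmpxchg64_fallback(uint64_t *ptr, uint64_t *oldp,
				   uint64_t new)
{
	uint64_t old = *oldp;
	uint64_t prev = __sync_val_compare_and_swap(ptr, old, new);

	if (prev != old)
		*oldp = prev;	/* report what was actually in memory */

	return prev == old;
}
```

The fallback still needs the compare, so it only provides the interface;
the code-size win comes from an architecture-specific implementation
that uses the flag produced by the instruction itself.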
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Borislav Petkov <bp@...en8.de>
Cc: Dave Hansen <dave.hansen@...ux.intel.com>
Cc: "H. Peter Anvin" <hpa@...or.com>
Cc: Will Deacon <will@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Boqun Feng <boqun.feng@...il.com>
Cc: Mark Rutland <mark.rutland@....com>
Cc: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Marco Elver <elver@...gle.com>
Uros Bizjak (2):
  locking/atomic: Add generic try_cmpxchg64 support
  locking/atomic/x86: Introduce arch_try_cmpxchg64
 arch/x86/include/asm/cmpxchg_32.h           | 21 ++++++
 arch/x86/include/asm/cmpxchg_64.h           |  6 ++
 include/linux/atomic/atomic-arch-fallback.h | 72 ++++++++++++++++++++-
 include/linux/atomic/atomic-instrumented.h  | 40 +++++++++++-
 scripts/atomic/gen-atomic-fallback.sh       | 31 +++++----
 scripts/atomic/gen-atomic-instrumented.sh   |  2 +-
 6 files changed, 156 insertions(+), 16 deletions(-)
--
2.35.1