[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <tip-9d4646d14d51d62b967a12452c30ea7edf8dd8fa@git.kernel.org>
Date: Fri, 27 Apr 2018 02:42:36 -0700
From: tip-bot for Will Deacon <tipbot@...or.com>
To: linux-tip-commits@...r.kernel.org
Cc: will.deacon@....com, torvalds@...ux-foundation.org,
linux-kernel@...r.kernel.org, peterz@...radead.org,
mingo@...nel.org, tglx@...utronix.de, hpa@...or.com,
longman@...hat.com
Subject: [tip:locking/core] locking/qspinlock: Elide back-to-back RELEASE
operations with smp_wmb()
Commit-ID: 9d4646d14d51d62b967a12452c30ea7edf8dd8fa
Gitweb: https://git.kernel.org/tip/9d4646d14d51d62b967a12452c30ea7edf8dd8fa
Author: Will Deacon <will.deacon@....com>
AuthorDate: Thu, 26 Apr 2018 11:34:25 +0100
Committer: Ingo Molnar <mingo@...nel.org>
CommitDate: Fri, 27 Apr 2018 09:48:52 +0200
locking/qspinlock: Elide back-to-back RELEASE operations with smp_wmb()
The qspinlock slowpath must ensure that the MCS node is fully initialised
before it can be reached by another other CPU. This is currently enforced
by using a RELEASE operation when updating the tail and also when linking
the node into the waitqueue, since the control dependency off xchg_tail
is insufficient to enforce sufficient ordering, see:
95bcade33a8a ("locking/qspinlock: Ensure node is initialised before updating prev->next")
Back-to-back RELEASE operations may be expensive on some architectures,
particularly those that implement them using fences under the hood. We
can replace the two RELEASE operations with a single smp_wmb() fence and
use RELAXED operations for the subsequent publishing of the node.
Signed-off-by: Will Deacon <will.deacon@....com>
Acked-by: Peter Zijlstra (Intel) <peterz@...radead.org>
Acked-by: Waiman Long <longman@...hat.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: boqun.feng@...il.com
Cc: linux-arm-kernel@...ts.infradead.org
Cc: paulmck@...ux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1524738868-31318-12-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@...nel.org>
---
kernel/locking/qspinlock.c | 33 +++++++++++++++++----------------
1 file changed, 17 insertions(+), 16 deletions(-)
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index d6c3b029bd93..956a12983bd0 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -164,10 +164,10 @@ static __always_inline void clear_pending_set_locked(struct qspinlock *lock)
static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
{
/*
- * Use release semantics to make sure that the MCS node is properly
- * initialized before changing the tail code.
+ * We can use relaxed semantics since the caller ensures that the
+ * MCS node is properly initialized before updating the tail.
*/
- return (u32)xchg_release(&lock->tail,
+ return (u32)xchg_relaxed(&lock->tail,
tail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET;
}
@@ -212,10 +212,11 @@ static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
for (;;) {
new = (val & _Q_LOCKED_PENDING_MASK) | tail;
/*
- * Use release semantics to make sure that the MCS node is
- * properly initialized before changing the tail code.
+ * We can use relaxed semantics since the caller ensures that
+ * the MCS node is properly initialized before updating the
+ * tail.
*/
- old = atomic_cmpxchg_release(&lock->val, val, new);
+ old = atomic_cmpxchg_relaxed(&lock->val, val, new);
if (old == val)
break;
@@ -388,12 +389,18 @@ queue:
goto release;
/*
+ * Ensure that the initialisation of @node is complete before we
+ * publish the updated tail via xchg_tail() and potentially link
+ * @node into the waitqueue via WRITE_ONCE(prev->next, node) below.
+ */
+ smp_wmb();
+
+ /*
+ * Publish the updated tail.
* We have already touched the queueing cacheline; don't bother with
* pending stuff.
*
* p,*,* -> n,*,*
- *
- * RELEASE, such that the stores to @node must be complete.
*/
old = xchg_tail(lock, tail);
next = NULL;
@@ -405,14 +412,8 @@ queue:
if (old & _Q_TAIL_MASK) {
prev = decode_tail(old);
- /*
- * We must ensure that the stores to @node are observed before
- * the write to prev->next. The address dependency from
- * xchg_tail is not sufficient to ensure this because the read
- * component of xchg_tail is unordered with respect to the
- * initialisation of @node.
- */
- smp_store_release(&prev->next, node);
+ /* Link @node into the waitqueue. */
+ WRITE_ONCE(prev->next, node);
pv_wait_node(node, prev);
arch_mcs_spin_lock_contended(&node->locked);
Powered by blists - more mailing lists