Message-ID: <20160412045827.GA18437@linux-uzut.site>
Date: Mon, 11 Apr 2016 21:58:27 -0700
From: Davidlohr Bueso <dave@...olabs.net>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, will.deacon@....com,
waiman.long@....com, mingo@...hat.com, paulmck@...ux.vnet.ibm.com,
boqun.feng@...il.com, torvalds@...ux-foundation.org
Subject: Re: [RFC][PATCH 2/3] locking/qrwlock: Use smp_cond_load_acquire()

On Mon, 04 Apr 2016, Peter Zijlstra wrote:
>Use smp_cond_load_acquire() to make better use of the hardware
>assisted 'spin' wait on arm64.
>
>Arguably the second hunk is the more horrid abuse possible, but
>avoids having to use cmpwait (see next patch) directly. Also, this
>makes 'clever' (ab)use of the cond+rmb acquire to omit the acquire
>from cmpxchg().
>
>Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
>---
> kernel/locking/qrwlock.c | 18 ++++--------------
> 1 file changed, 4 insertions(+), 14 deletions(-)
>
>--- a/kernel/locking/qrwlock.c
>+++ b/kernel/locking/qrwlock.c
>@@ -53,10 +53,7 @@ struct __qrwlock {
> static __always_inline void
> rspin_until_writer_unlock(struct qrwlock *lock, u32 cnts)
> {
>- while ((cnts & _QW_WMASK) == _QW_LOCKED) {
>- cpu_relax_lowlatency();
>- cnts = atomic_read_acquire(&lock->cnts);
>- }
>+ smp_cond_load_acquire(&lock->cnts.counter, (VAL & _QW_WMASK) != _QW_LOCKED);
> }
>
> /**
>@@ -109,8 +106,6 @@ EXPORT_SYMBOL(queued_read_lock_slowpath)
> */
> void queued_write_lock_slowpath(struct qrwlock *lock)
> {
>- u32 cnts;
>-
> /* Put the writer into the wait queue */
> arch_spin_lock(&lock->wait_lock);
>
>@@ -134,15 +129,10 @@ void queued_write_lock_slowpath(struct q
> }
>
> /* When no more readers, set the locked flag */
>- for (;;) {
>- cnts = atomic_read(&lock->cnts);
>- if ((cnts == _QW_WAITING) &&
>- (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING,
>- _QW_LOCKED) == _QW_WAITING))
>- break;
>+ smp_cond_load_acquire(&lock->cnts.counter,
>+ (VAL == _QW_WAITING) &&
>+ atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING, _QW_LOCKED) == _QW_WAITING);
>
>- cpu_relax_lowlatency();

You would need some variant of cpu_relax_lowlatency() here, otherwise you'll be
hurting s390, no?

fwiw, back when I was looking at this, I recall thinking about possibly introducing
an smp_cond_acquire_lowlatency(), but I never got around to it.
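
Something along these lines, perhaps -- just a minimal sketch modeled on the
generic fallback form of smp_cond_load_acquire(); the name and exact shape are
assumptions on my part, not something from this series, and arm64 would of
course keep its own hardware-assisted implementation:

/*
 * Hypothetical lowlatency variant, modeled on the generic fallback of
 * smp_cond_load_acquire(): spin with cpu_relax_lowlatency() so that
 * s390 (and friends) can yield the cpu to the hypervisor while
 * waiting, and rely on the ctrl dependency + smp_rmb() for the
 * acquire, as the changelog above describes.
 */
#ifndef smp_cond_load_acquire_lowlatency
#define smp_cond_load_acquire_lowlatency(ptr, cond_expr) ({	\
	typeof(ptr) __PTR = (ptr);				\
	typeof(*ptr) VAL;					\
	for (;;) {						\
		VAL = READ_ONCE(*__PTR);			\
		if (cond_expr)					\
			break;					\
		cpu_relax_lowlatency();				\
	}							\
	smp_rmb(); /* ctrl + rmb := acquire */			\
	VAL;							\
})
#endif

The first hunk above would then read something like:

	smp_cond_load_acquire_lowlatency(&lock->cnts.counter,
					 (VAL & _QW_WMASK) != _QW_LOCKED);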
Thanks,
Davidlohr