Message-ID: <20160412045827.GA18437@linux-uzut.site>
Date:	Mon, 11 Apr 2016 21:58:27 -0700
From:	Davidlohr Bueso <dave@...olabs.net>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	linux-kernel@...r.kernel.org, will.deacon@....com,
	waiman.long@....com, mingo@...hat.com, paulmck@...ux.vnet.ibm.com,
	boqun.feng@...il.com, torvalds@...ux-foundation.org
Subject: Re: [RFC][PATCH 2/3] locking/qrwlock: Use smp_cond_load_acquire()

On Mon, 04 Apr 2016, Peter Zijlstra wrote:

>Use smp_cond_load_acquire() to make better use of the hardware
>assisted 'spin' wait on arm64.
>
>Arguably the second hunk is the more horrid abuse possible, but
>avoids having to use cmpwait (see next patch) directly. Also, this
>makes 'clever' (ab)use of the cond+rmb acquire to omit the acquire
>from cmpxchg().
>
>Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
>---
> kernel/locking/qrwlock.c |   18 ++++--------------
> 1 file changed, 4 insertions(+), 14 deletions(-)
>
>--- a/kernel/locking/qrwlock.c
>+++ b/kernel/locking/qrwlock.c
>@@ -53,10 +53,7 @@ struct __qrwlock {
> static __always_inline void
> rspin_until_writer_unlock(struct qrwlock *lock, u32 cnts)
> {
>-	while ((cnts & _QW_WMASK) == _QW_LOCKED) {
>-		cpu_relax_lowlatency();
>-		cnts = atomic_read_acquire(&lock->cnts);
>-	}
>+	smp_cond_load_acquire(&lock->cnts.counter, (VAL & _QW_WMASK) != _QW_LOCKED);
> }
>
> /**
>@@ -109,8 +106,6 @@ EXPORT_SYMBOL(queued_read_lock_slowpath)
>  */
> void queued_write_lock_slowpath(struct qrwlock *lock)
> {
>-	u32 cnts;
>-
> 	/* Put the writer into the wait queue */
> 	arch_spin_lock(&lock->wait_lock);
>
>@@ -134,15 +129,10 @@ void queued_write_lock_slowpath(struct q
> 	}
>
> 	/* When no more readers, set the locked flag */
>-	for (;;) {
>-		cnts = atomic_read(&lock->cnts);
>-		if ((cnts == _QW_WAITING) &&
>-		    (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING,
>-					    _QW_LOCKED) == _QW_WAITING))
>-			break;
>+	smp_cond_load_acquire(&lock->cnts.counter,
>+		(VAL == _QW_WAITING) &&
>+		atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING, _QW_LOCKED) == _QW_WAITING);
>
>-		cpu_relax_lowlatency();

You would need some variant for cpu_relax_lowlatency(), otherwise you'll be hurting s390, no?
FWIW, back when I was looking at this, I recall thinking about possibly introducing
smp_cond_acquire_lowlatency(), but I never got around to it.
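
Something like the below, perhaps -- just a rough, untested sketch on top of the
cond+rmb style fallback, with cpu_relax() swapped for cpu_relax_lowlatency()
(name and placement made up, purely illustrative):

/*
 * Hypothetical lowlatency variant: same shape as a generic
 * smp_cond_load_acquire() fallback, but relaxing with
 * cpu_relax_lowlatency() to avoid the heavyweight cpu_relax()
 * (hypervisor yield) on s390 in the lock spin path.
 */
#define smp_cond_load_acquire_lowlatency(ptr, cond_expr) ({	\
	typeof(ptr) __PTR = (ptr);				\
	typeof(*ptr) VAL;					\
	for (;;) {						\
		VAL = READ_ONCE(*__PTR);			\
		if (cond_expr)					\
			break;					\
		cpu_relax_lowlatency();				\
	}							\
	smp_rmb(); /* ctrl + rmb := acquire */			\
	VAL;							\
})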

Thanks,
Davidlohr
