lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 12 Apr 2024 11:42:34 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: linux-kernel@...r.kernel.org, linux-tip-commits@...r.kernel.org,
	Uros Bizjak <ubizjak@...il.com>, Waiman Long <longman@...hat.com>,
	x86@...nel.org
Subject: [PATCH v2] locking/pvqspinlock: Use try_cmpxchg_acquire() in
 trylock_clear_pending()


* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Thu, 11 Apr 2024 at 06:33, tip-bot2 for Uros Bizjak
> <tip-bot2@...utronix.de> wrote:
> >
> > Use try_cmpxchg_acquire(*ptr, &old, new) instead of
> > cmpxchg_relaxed(*ptr, old, new) == old in trylock_clear_pending().
> 
> The above commit message is horribly confusing and wrong.
> 
> I was going "that's not right", because it says "use acquire instead
> of relaxed" memory ordering, and then goes on to say "No functional
> change intended".
> 
> But it turns out the *code* was always acquire, and it's only the
> commit message that is wrong, presumably due to a bit too much
> cut-and-paste.

Yeah, the replacement is cmpxchg_acquire() => try_cmpxchg_acquire(), with 
no change to memory ordering.

> But please fix the commit message, and use the right memory ordering
> in the explanations too.

Done, find below the new patch, which a hopefully better commit message.

I also added your Reviewed-by tag optimistically :)

Thanks,

	Ingo

===========================>
From: Uros Bizjak <ubizjak@...il.com>
Date: Mon, 25 Mar 2024 15:09:32 +0100
Subject: [PATCH] locking/pvqspinlock: Use try_cmpxchg_acquire() in trylock_clear_pending()

Replace this pattern in trylock_clear_pending():

    cmpxchg_acquire(*ptr, old, new) == old

.. with the simpler and faster:

    try_cmpxchg_acquire(*ptr, &old, new)

The x86 CMPXCHG instruction returns success in the ZF flag, so this change
saves a compare after the CMPXCHG.

Also change the return type of the function to bool and streamline
the control flow in the _Q_PENDING_BITS == 8 variant a bit.

No functional change intended.

Signed-off-by: Uros Bizjak <ubizjak@...il.com>
Signed-off-by: Ingo Molnar <mingo@...nel.org>
Reviewed-by: Waiman Long <longman@...hat.com>
Reviewed-by: Linus Torvalds <torvalds@...ux-foundation.org>
Link: https://lore.kernel.org/r/20240325140943.815051-1-ubizjak@gmail.com
---
 kernel/locking/qspinlock_paravirt.h | 31 +++++++++++++------------------
 1 file changed, 13 insertions(+), 18 deletions(-)

diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
index 169950fe1aad..77ba80bd95f9 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -116,11 +116,12 @@ static __always_inline void set_pending(struct qspinlock *lock)
  * barrier. Therefore, an atomic cmpxchg_acquire() is used to acquire the
  * lock just to be sure that it will get it.
  */
-static __always_inline int trylock_clear_pending(struct qspinlock *lock)
+static __always_inline bool trylock_clear_pending(struct qspinlock *lock)
 {
+	u16 old = _Q_PENDING_VAL;
+
 	return !READ_ONCE(lock->locked) &&
-	       (cmpxchg_acquire(&lock->locked_pending, _Q_PENDING_VAL,
-				_Q_LOCKED_VAL) == _Q_PENDING_VAL);
+	       try_cmpxchg_acquire(&lock->locked_pending, &old, _Q_LOCKED_VAL);
 }
 #else /* _Q_PENDING_BITS == 8 */
 static __always_inline void set_pending(struct qspinlock *lock)
@@ -128,27 +129,21 @@ static __always_inline void set_pending(struct qspinlock *lock)
 	atomic_or(_Q_PENDING_VAL, &lock->val);
 }
 
-static __always_inline int trylock_clear_pending(struct qspinlock *lock)
+static __always_inline bool trylock_clear_pending(struct qspinlock *lock)
 {
-	int val = atomic_read(&lock->val);
-
-	for (;;) {
-		int old, new;
-
-		if (val  & _Q_LOCKED_MASK)
-			break;
+	int old, new;
 
+	old = atomic_read(&lock->val);
+	do {
+		if (old & _Q_LOCKED_MASK)
+			return false;
 		/*
 		 * Try to clear pending bit & set locked bit
 		 */
-		old = val;
-		new = (val & ~_Q_PENDING_MASK) | _Q_LOCKED_VAL;
-		val = atomic_cmpxchg_acquire(&lock->val, old, new);
+		new = (old & ~_Q_PENDING_MASK) | _Q_LOCKED_VAL;
+	} while (!atomic_try_cmpxchg_acquire (&lock->val, &old, new));
 
-		if (val == old)
-			return 1;
-	}
-	return 0;
+	return true;
 }
 #endif /* _Q_PENDING_BITS == 8 */
 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ