lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Wed,  4 Jan 2023 21:19:52 -0500
From:   guoren@...nel.org
To:     peterz@...radead.org, longman@...hat.com, mingo@...hat.com
Cc:     linux-kernel@...r.kernel.org, guoren@...nel.org,
        Guo Ren <guoren@...ux.alibaba.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Will Deacon <will@...nel.org>
Subject: [PATCH V2] locking/qspinlock: Optimize pending state waiting for unlock

From: Guo Ren <guoren@...ux.alibaba.com>

When we're pending, we only care about lock value. The xchg_tail
wouldn't affect the pending state. That means the hardware thread
could stay in a sleep state and leaves the rest execution units'
resources of pipeline to other hardware threads. This situation is
the SMT scenarios in the same core. Not an entering low-power state
situation. Of course, the granularity between cores is "cacheline",
but the granularity between SMT hw threads of the same core could
be "byte" which internal LSU handles. For example, when a hw-thread
yields the resources of the core to other hw-threads, this patch
could help the hw-thread stay in the sleep state and prevent it
from being woken up by other hw-threads xchg_tail.

Link: https://lore.kernel.org/lkml/20221224120545.262989-1-guoren@kernel.org/
Signed-off-by: Guo Ren <guoren@...ux.alibaba.com>
Signed-off-by: Guo Ren <guoren@...nel.org>
Acked-by: Waiman Long <longman@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Boqun Feng <boqun.feng@...il.com>
Cc: Will Deacon <will@...nel.org>
Cc: Ingo Molnar <mingo@...hat.com>
---
Changes in v2:
 - Add acked tag
 - Optimize commit log
 - Add discussion Link tag
---
 kernel/locking/qspinlock.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 2b23378775fe..ebe6b8ec7cb3 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -371,7 +371,7 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	/*
 	 * We're pending, wait for the owner to go away.
 	 *
-	 * 0,1,1 -> 0,1,0
+	 * 0,1,1 -> *,1,0
 	 *
 	 * this wait loop must be a load-acquire such that we match the
 	 * store-release that clears the locked bit and create lock
@@ -380,7 +380,7 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	 * barriers.
 	 */
 	if (val & _Q_LOCKED_MASK)
-		atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_MASK));
+		smp_cond_load_acquire(&lock->locked, !VAL);
 
 	/*
 	 * take ownership and clear the pending bit.
-- 
2.36.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ