Date: Wed, 20 Dec 2023 20:19:53 -0500
From: Waiman Long <longman@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	Will Deacon <will@...nel.org>,
	Boqun Feng <boqun.feng@...il.com>
Cc: linux-kernel@...r.kernel.org,
	Zeng Heng <zengheng4@...wei.com>,
	Waiman Long <longman@...hat.com>
Subject: [PATCH] locking/osq_lock: Minimize access to node->prev's cacheline

When CONFIG_PARAVIRT_SPINLOCKS is on, osq_lock() spins on both the
node's and node->prev's cachelines while waiting for its node->locked
value to be set.  That can cause cacheline contention with another
CPU that is modifying node->prev's cacheline.  The reason for
accessing node->prev's cacheline is to read the node->prev->cpu value
as the argument to the vcpu_is_preempted() call.  However, this value
is constant as long as node->prev doesn't change.

Minimize that contention by caching the node->prev and node->prev->cpu
values, updating them (and thus touching node->prev's cacheline) only
when node->prev actually changes.
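
For illustration only, a minimal user-space analogue of that caching
pattern (C11 atomics; struct my_node, node_cpu() and cpu_is_preempted()
are illustrative stand-ins, not the kernel's types or helpers):

/*
 * Sketch of the caching pattern: prev's cacheline is only read again
 * when prev itself has changed.
 */
#include <stdatomic.h>
#include <stdbool.h>

struct my_node {
	_Atomic(struct my_node *) prev;
	int cpu;
};

/* Reads prev's cacheline, like node_cpu(node->prev) in osq_lock.c. */
static inline int node_cpu(struct my_node *n)
{
	return n->cpu;
}

/* Stand-in for vcpu_is_preempted(); always false in this sketch. */
static inline bool cpu_is_preempted(int cpu)
{
	(void)cpu;
	return false;
}

/*
 * Called on every spin iteration.  *pprev and *ppvcpu cache the last
 * observed prev pointer and its CPU number, so prev's cacheline is
 * only touched again when prev actually changes.
 */
static bool prev_preempted_cached(struct my_node *node,
				  struct my_node **pprev, int *ppvcpu)
{
	struct my_node *prev =
		atomic_load_explicit(&node->prev, memory_order_relaxed);

	if (prev != *pprev) {
		*pprev = prev;
		*ppvcpu = node_cpu(prev);
	}
	return cpu_is_preempted(*ppvcpu);
}

Without the cache, every spin iteration would dereference node->prev
to fetch ->cpu, pulling in prev's cacheline even while another CPU is
writing to it.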

With an osq_lock/unlock microbenchmark that repeatedly calls
osq_lock/unlock in a loop with 96 locking threads on a 2-socket
96-CPU machine, the locking rates before and after this patch were:

  Before patch: 1701 kops/s
  After patch:  3052 kops/s
  % Change:     +79.4%

This patch thus provides a significant performance boost to the
OSQ lock.
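
The benchmark itself is not included in this mail; the general shape
of such a lock/unlock-in-a-loop harness can be sketched as a
user-space analogue (a pthread spinlock stands in for the OSQ lock,
so it does not reproduce the numbers above):

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

#define NTHREADS	96
#define SECONDS		10

static pthread_spinlock_t lock;		/* stand-in for the OSQ lock */
static atomic_int stop;
static atomic_llong total_ops;

static void *worker(void *arg)
{
	long long ops = 0;

	(void)arg;
	while (!atomic_load_explicit(&stop, memory_order_relaxed)) {
		pthread_spin_lock(&lock);
		pthread_spin_unlock(&lock);
		ops++;
	}
	atomic_fetch_add(&total_ops, ops);
	return NULL;
}

int main(void)
{
	pthread_t tid[NTHREADS];
	int i;

	pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE);
	for (i = 0; i < NTHREADS; i++)
		pthread_create(&tid[i], NULL, worker, NULL);

	sleep(SECONDS);
	atomic_store(&stop, 1);
	for (i = 0; i < NTHREADS; i++)
		pthread_join(tid[i], NULL);

	/* Report the aggregate locking rate in kops/s. */
	printf("%.1f kops/s\n",
	       (double)atomic_load(&total_ops) / SECONDS / 1000.0);
	return 0;
}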

Signed-off-by: Waiman Long <longman@...hat.com>
---
 kernel/locking/osq_lock.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index d5610ad52b92..6d7218f4b6e4 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -87,12 +87,29 @@ osq_wait_next(struct optimistic_spin_queue *lock,
 	return next;
 }
 
+#ifndef vcpu_is_preempted
+#define prev_vcpu_is_preempted(n, p, c)	false
+#else
+static inline bool prev_vcpu_is_preempted(struct optimistic_spin_node *node,
+					  struct optimistic_spin_node **pprev,
+					  int *ppvcpu)
+{
+	struct optimistic_spin_node *prev = READ_ONCE(node->prev);
+
+	if (prev != *pprev) {
+		*pprev = prev;
+		*ppvcpu = node_cpu(prev);
+	}
+	return vcpu_is_preempted(*ppvcpu);
+}
+#endif
+
 bool osq_lock(struct optimistic_spin_queue *lock)
 {
 	struct optimistic_spin_node *node = this_cpu_ptr(&osq_node);
 	struct optimistic_spin_node *prev, *next;
 	int curr = encode_cpu(smp_processor_id());
-	int old;
+	int old, pvcpu;
 
 	node->locked = 0;
 	node->next = NULL;
@@ -110,6 +127,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 
 	prev = decode_cpu(old);
 	node->prev = prev;
+	pvcpu = node_cpu(prev);
 
 	/*
 	 * osq_lock()			unqueue
@@ -141,7 +159,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	 * polling, be careful.
 	 */
 	if (smp_cond_load_relaxed(&node->locked, VAL || need_resched() ||
-				  vcpu_is_preempted(node_cpu(node->prev))))
+				  prev_vcpu_is_preempted(node, &prev, &pvcpu)))
 		return true;
 
 	/* unqueue */
-- 
2.39.3

