Date:	Tue, 14 Jan 2014 16:33:10 -0800
From:	Jason Low <jason.low2@...com>
To:	mingo@...hat.com, peterz@...radead.org, paulmck@...ux.vnet.ibm.com,
	Waiman.Long@...com, torvalds@...ux-foundation.org,
	tglx@...utronix.de, jason.low2@...com
Cc:	linux-kernel@...r.kernel.org, riel@...hat.com,
	akpm@...ux-foundation.org, davidlohr@...com, hpa@...or.com,
	aswin@...com, scott.norton@...com
Subject: [RFC 3/3] mutex: When there is no owner, stop spinning after too many tries

When running workloads with high mutex contention on an 8-socket machine,
spinners would often spin for a long time while there was no lock owner.

One potential reason is that a thread can be preempted after clearing
lock->owner but before releasing the lock, or preempted after acquiring the
mutex but before setting lock->owner. In either case, the spinner cannot
check whether the owner is on_cpu because lock->owner is NULL.
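
For reference, the ordering in the fastpaths that opens this window looks
roughly like the following (a simplified sketch of kernel/locking/mutex.c,
with debug and ww-mutex details omitted), with the preemption windows
marked:

void __sched mutex_lock(struct mutex *lock)
{
	might_sleep();
	__mutex_fastpath_lock(&lock->count, __mutex_lock_slowpath);
	/* preemption here leaves the lock held with lock->owner == NULL */
	mutex_set_owner(lock);
}

void __sched mutex_unlock(struct mutex *lock)
{
	mutex_clear_owner(lock);
	/*
	 * Preemption here leaves lock->owner == NULL while lock->count
	 * still shows the mutex as taken, so spinners see !owner.
	 */
	__mutex_fastpath_unlock(&lock->count, __mutex_unlock_slowpath);
}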

One way to address the preemption part of this problem would be to disable
preemption between acquiring/releasing the mutex and setting/clearing
lock->owner. However, that would add overhead to the mutex fastpath.
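
As a rough illustration of that alternative (not what this patch does), the
unlock side would look something like the sketch below; the lock side would
need similar treatment and is trickier since the lock slowpath can sleep:

/*
 * Hypothetical: keep lock->owner consistent with lock->count by disabling
 * preemption across the fastpath.  This closes the NULL-owner window on
 * unlock, but adds preempt_disable()/preempt_enable() to every
 * uncontended unlock.
 */
void __sched mutex_unlock(struct mutex *lock)
{
	preempt_disable();
	mutex_clear_owner(lock);
	__mutex_fastpath_unlock(&lock->count, __mutex_unlock_slowpath);
	preempt_enable();
}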

The solution used in this patch is to limit the number of times a thread can
spin on lock->count when there is no owner.

The threshold used in this patch for each spinner was 128, which appeared to
be a generous value; suggestions for other ways to determine the threshold
are welcome.

Signed-off-by: Jason Low <jason.low2@...com>
---
 kernel/locking/mutex.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index b500cc7..9465604 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -43,6 +43,7 @@
  * mutex.
  */
 #define	MUTEX_SHOW_NO_WAITER(mutex)	(atomic_read(&(mutex)->count) >= 0)
+#define	MUTEX_SPIN_THRESHOLD		(128)
 
 void
 __mutex_init(struct mutex *lock, const char *name, struct lock_class_key *key)
@@ -418,7 +419,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 	struct task_struct *task = current;
 	struct mutex_waiter waiter;
 	unsigned long flags;
-	int ret;
+	int ret, nr_spins = 0;
 	struct mspin_node node;
 
 	preempt_disable();
@@ -453,6 +454,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 	mspin_lock(MLOCK(lock), &node);
 	for (;;) {
 		struct task_struct *owner;
+		nr_spins++;
 
 		if (use_ww_ctx && ww_ctx->acquired > 0) {
 			struct ww_mutex *ww;
@@ -502,9 +504,11 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		 * When there's no owner, we might have preempted between the
 		 * owner acquiring the lock and setting the owner field. If
 		 * we're an RT task that will live-lock because we won't let
-		 * the owner complete.
+		 * the owner complete. Additionally, when there is no owner,
+		 * stop spinning after too many tries.
 		 */
-		if (!owner && (need_resched() || rt_task(task))) {
+		if (!owner && (need_resched() || rt_task(task) ||
+		               nr_spins > MUTEX_SPIN_THRESHOLD)) {
 			mspin_unlock(MLOCK(lock), &node);
 			goto slowpath;
 		}
-- 
1.7.1
