Date:	Wed, 15 Jan 2014 18:45:30 -0800
From:	Jason Low <jason.low2@...com>
To:	mingo@...hat.com
Cc:	peterz@...radead.org, paulmck@...ux.vnet.ibm.com,
	Waiman.Long@...com, torvalds@...ux-foundation.org,
	tglx@...utronix.de, linux-kernel@...r.kernel.org, riel@...hat.com,
	akpm@...ux-foundation.org, davidlohr@...com, hpa@...or.com,
	aswin@...com, scott.norton@...com
Subject: Re: [RFC 3/3] mutex: When there is no owner, stop spinning after
 too many tries

On Tue, 2014-01-14 at 16:33 -0800, Jason Low wrote:
> When running workloads that have high contention in mutexes on an 8 socket
> machine, spinners would often spin for a long time with no lock owner.
> 
> One potential reason is that a thread can be preempted after clearing
> lock->owner but before releasing the lock, or preempted after acquiring
> the mutex but before setting lock->owner. In those cases, the spinner
> cannot check whether the owner is on_cpu because lock->owner is NULL.

It looks like a bigger source of !owner latency is in
__mutex_unlock_common_slowpath(). If __mutex_slowpath_needs_to_unlock(),
the owner must acquire the wait_lock before it can set lock->count
to 1. When the wait_lock is contended, which is occurring with some
workloads on my box, this can delay the owner's release of the lock
by quite a bit.

Any comments on the change below, which unlocks the mutex (sets
lock->count to 1) before taking lock->wait_lock to wake up a waiter? Thanks.

---
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index b500cc7..38f0eb0 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -723,10 +723,6 @@ __mutex_unlock_common_slowpath(atomic_t *lock_count, int nested)
        struct mutex *lock = container_of(lock_count, struct mutex, count);
        unsigned long flags;
 
-       spin_lock_mutex(&lock->wait_lock, flags);
-       mutex_release(&lock->dep_map, nested, _RET_IP_);
-       debug_mutex_unlock(lock);
-
        /*
         * some architectures leave the lock unlocked in the fastpath failure
         * case, others need to leave it locked. In the later case we have to
@@ -735,6 +731,10 @@ __mutex_unlock_common_slowpath(atomic_t *lock_count, int nested)
        if (__mutex_slowpath_needs_to_unlock())
                atomic_set(&lock->count, 1);
 
+       spin_lock_mutex(&lock->wait_lock, flags);
+       mutex_release(&lock->dep_map, nested, _RET_IP_);
+       debug_mutex_unlock(lock);
+
        if (!list_empty(&lock->wait_list)) {
                /* get the first entry from the wait-list: */
                struct mutex_waiter *waiter =


--
