Date: Sun, 02 Feb 2014 13:01:23 -0800
From: Jason Low <jason.low2@...com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>,
Paul McKenney <paulmck@...ux.vnet.ibm.com>,
Waiman Long <Waiman.Long@...com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Rik van Riel <riel@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Davidlohr Bueso <davidlohr@...com>,
"H. Peter Anvin" <hpa@...or.com>, Andi Kleen <andi@...stfloor.org>,
"Chandramouleeswaran, Aswin" <aswin@...com>,
"Norton, Scott J" <scott.norton@...com>, chegu_vinod@...com
Subject: Re: [RFC][PATCH v2 5/5] mutex: Give spinners a chance to
spin_on_owner if need_resched() triggered while queued
On Fri, 2014-01-31 at 21:08 +0100, Peter Zijlstra wrote:
> On Fri, Jan 31, 2014 at 12:01:37PM -0800, Jason Low wrote:
> > Currently still getting soft lockups with the updated version.
>
> Bugger.. ok clearly I need to think harder still. I'm fairly sure this
> cancelation can work though, just seems tricky to get right :-)
Ok, I believe I have found a race condition between m_spin_lock() and
m_spin_unlock().

In m_spin_unlock(), we do "next = ACCESS_ONCE(node->next)". Then, if
next is not NULL, we proceed to set next->locked to 1.

A thread in m_spin_lock() in the unqueue path could execute
"next = cmpxchg(&prev->next, node, NULL)" after the thread in
m_spin_unlock() accesses its node->next and finds that it is not NULL.
Then, the thread in m_spin_lock() could check !node->locked before
the thread in m_spin_unlock() sets next->locked to 1.
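To make the ordering concrete, here is a rough single-threaded replay of that interleaving in plain userspace C. It is only a sketch: struct qnode and replay_lost_handoff() are hypothetical names, the real code uses ACCESS_ONCE() and cmpxchg() on two CPUs, and what the unqueueing thread does after seeing locked == 0 is simplified to "it leaves the queue".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the queue node; field names follow the text. */
struct qnode {
	struct qnode *next;
	int locked;
};

/*
 * Replays the problematic ordering step by step on one thread.
 * Returns 1 if the waiter left the queue and was still handed the lock.
 */
static int replay_lost_handoff(void)
{
	struct qnode unlocker = { NULL, 1 };	/* thread in m_spin_unlock() */
	struct qnode waiter = { NULL, 0 };	/* thread in the unqueue path */
	struct qnode *seen, *old;
	int waiter_left = 0;

	unlocker.next = &waiter;

	/* Step 1 (unlocker): next = ACCESS_ONCE(node->next) is not NULL. */
	seen = unlocker.next;
	assert(seen == &waiter);

	/* Step 2 (waiter): cmpxchg(&prev->next, node, NULL) succeeds. */
	old = unlocker.next;
	if (old == &waiter)
		unlocker.next = NULL;

	/* Step 3 (waiter): !node->locked is still true, so it gives up. */
	if (old == &waiter && !waiter.locked)
		waiter_left = 1;	/* assumption: modelled as leaving the queue */

	/* Step 4 (unlocker): sets next->locked through the stale pointer. */
	seen->locked = 1;

	return waiter_left && waiter.locked == 1 && unlocker.next == NULL;
}
```

Calling replay_lost_handoff() returns 1 here: the handoff goes to a node that has already unqueued itself, so nobody who is still waiting gets next->locked set.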
The following additional change was able to solve the initial lockups that were
occurring when running fserver on a 2-socket box.
---
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 9eb4dbe..e71a84a 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -513,8 +513,13 @@ static void m_spin_unlock(struct m_spinlock **lock)
 			return;
 
 		next = ACCESS_ONCE(node->next);
-		if (unlikely(next))
-			break;
+
+		if (unlikely(next)) {
+			next = cmpxchg(&node->next, next, NULL);
+
+			if (next)
+				break;
+		}
 
 		arch_mutex_cpu_relax();
 	}
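
For anyone who wants to poke at the claim step outside the kernel, below is a minimal userspace sketch using C11 atomics. It is not the kernel code: struct qnode, cmpxchg_next() and claim_next() are hypothetical names, and cmpxchg_next() only imitates the "return the old value" behaviour of cmpxchg() with default sequentially consistent ordering.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the queue node; field names follow the patch. */
struct qnode {
	_Atomic(struct qnode *) next;
	atomic_int locked;
};

/*
 * Imitates cmpxchg(&node->next, old, newval): returns the value that was
 * in node->next before the call, whether or not the swap happened.
 */
static struct qnode *cmpxchg_next(struct qnode *node, struct qnode *old,
				  struct qnode *newval)
{
	/* On failure, 'old' is overwritten with the current value. */
	atomic_compare_exchange_strong(&node->next, &old, newval);
	return old;
}

/*
 * Unlock-side step from the patch: read the successor, then try to
 * clear node->next before trusting the pointer that was read.
 */
static struct qnode *claim_next(struct qnode *node)
{
	struct qnode *next = atomic_load(&node->next);

	if (next)
		next = cmpxchg_next(node, next, NULL);

	return next;	/* NULL means: nothing claimed, caller keeps looping */
}
```

Once claim_next() has cleared node->next, a later cmpxchg_next(prev, node, NULL) from the unqueue side returns NULL instead of node, so that thread can tell it did not remove itself.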