linux-kernel - Re: [PATCH] mutexes: Add CONFIG_DEBUG_MUTEX_FASTPATH=y debug variant to debug SMP races

Message-ID: <20131205065731.GA29736@hostway.ca>
Date:	Wed, 4 Dec 2013 22:57:31 -0800
From:	Simon Kirby <sim@...tway.ca>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Waiman Long <Waiman.Long@...com>,
	Ian Applegate <ia@...udflare.com>,
	Al Viro <viro@...iv.linux.org.uk>,
	Christoph Lameter <cl@...two.org>,
	Pekka Enberg <penberg@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Chris Mason <chris.mason@...ionio.com>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH] mutexes: Add CONFIG_DEBUG_MUTEX_FASTPATH=y debug variant
 to debug SMP races

On Tue, Dec 03, 2013 at 09:52:33AM +0100, Ingo Molnar wrote:

> Indeed: this comes from mutex->count being separate from 
> mutex->wait_lock, and this should affect every architecture that has a 
> mutex->count fast-path implemented (essentially every architecture 
> that matters).
> 
> Such bugs should also magically go away with mutex debugging enabled.

Confirmed: I ran the reproducer with CONFIG_DEBUG_MUTEXES=y for a few
hours and never got a single "Poison overwritten" report.

> I'd expect such bugs to be more prominent with unlucky object 
> size/alignment: if mutex->count lies on a separate cache line from 
> mutex->wait_lock.
> 
> Side note: this might be a valid light weight debugging technique, we 
> could add padding between the two fields to force them into separate 
> cache lines, without slowing it down.
> 
> Simon, would you be willing to try the fairly trivial patch below? 
> Please enable CONFIG_DEBUG_MUTEX_FASTPATH=y. Does your kernel fail 
> faster that way?

I didn't see much of a change, other than that the incremented poison
byte now sits at a larger offset because of the padding, and the
corruption now shows up in kmalloc-256.
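
For reference, the padding idea can be sketched in userspace C roughly
as follows. This is only an illustration, not the contents of the
patch: struct demo_mutex, its plain int fields, the helper names and
the 64-byte CACHE_LINE_SIZE are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines, as on typical x86-64 parts. */
#define CACHE_LINE_SIZE 64

/* Hypothetical stand-in for struct mutex; not the kernel's layout. */
struct demo_mutex {
	int  count;                 /* stand-in for the fast-path counter */
	char pad[CACHE_LINE_SIZE];  /* debug-only padding between the fields */
	int  wait_lock;             /* stand-in for the slow-path spinlock */
};

/*
 * With at least one full line of padding between them, the two fields
 * cannot share a cache line, whatever the alignment of the object.
 */
static_assert(offsetof(struct demo_mutex, wait_lock) -
	      offsetof(struct demo_mutex, count) >= CACHE_LINE_SIZE,
	      "count and wait_lock must be at least one line apart");

/* Cache-line index of a byte offset, relative to the object start. */
static size_t cache_line_of(size_t offset)
{
	return offset / CACHE_LINE_SIZE;
}

/* Returns 1 if the two fields fall in different cache lines. */
static int fields_on_separate_lines(void)
{
	return cache_line_of(offsetof(struct demo_mutex, count)) !=
	       cache_line_of(offsetof(struct demo_mutex, wait_lock));
}
```

The padding also grows the object, which is consistent with the
containing allocation moving to a larger kmalloc cache.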

I also tried Linus' udelay() suggestion (patch below). With it, the
corruption triggered many times per second.
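
The "head + 1 == tail" test in the patch below is meant to detect that
the lock is held with nobody queued behind the holder. A minimal
single-threaded model of that check, assuming 8-bit tickets and a
ticket increment of 1, might look like this. Only the head/tail field
names come from the patch; struct demo_tickets and the demo_* helpers
are hypothetical, and the uint8_t cast (which keeps the comparison
correct when the counters wrap) is an addition of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a ticket lock; no atomics, no spinning. */
struct demo_tickets {
	uint8_t head;  /* ticket currently being served */
	uint8_t tail;  /* next ticket to hand out */
};

/* Take a ticket; the caller holds the lock once head reaches it. */
static uint8_t demo_take_ticket(struct demo_tickets *t)
{
	return t->tail++;
}

/* Release the lock by serving the next ticket. */
static void demo_release(struct demo_tickets *t)
{
	t->head++;
}

/*
 * True when exactly one ticket is outstanding: the lock is held and
 * no one is waiting. The cast keeps the sum in 8-bit arithmetic so
 * that head == 255, tail == 0 is still recognised.
 */
static int demo_held_uncontended(const struct demo_tickets *t)
{
	return (uint8_t)(t->head + 1) == t->tail;
}

/* Walks through the states the check is supposed to distinguish. */
static void demo_self_check(void)
{
	struct demo_tickets t = { 0, 0 };

	assert(!demo_held_uncontended(&t));  /* unlocked */
	demo_take_ticket(&t);
	assert(demo_held_uncontended(&t));   /* held, no waiters */
	demo_take_ticket(&t);
	assert(!demo_held_uncontended(&t));  /* held, one waiter */
	demo_release(&t);
	assert(demo_held_uncontended(&t));   /* waiter now holds it alone */
}
```

In this model the delay would run only in the "held, no waiters"
state, which is the window where another CPU can take the fast path
and free the mutex while the slow path is still inside it.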

Simon-

diff --git a/kernel/mutex.c b/kernel/mutex.c
index d24105b..f65e735 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -25,6 +25,7 @@
 #include <linux/spinlock.h>
 #include <linux/interrupt.h>
 #include <linux/debug_locks.h>
+#include <linux/delay.h>
 
 /*
  * In the DEBUG case we are using the "NULL fastpath" for mutexes,
@@ -740,6 +741,11 @@ __mutex_unlock_common_slowpath(atomic_t *lock_count, int nested)
 		wake_up_process(waiter->task);
 	}
 
+	/* udelay a bit if the spinlock isn't contended */
+	if (lock->wait_lock.rlock.raw_lock.tickets.head + 1 ==
+	    lock->wait_lock.rlock.raw_lock.tickets.tail)
+		udelay(1);
+
 	spin_unlock_mutex(&lock->wait_lock, flags);
 }
 