linux-kernel - Re: [PATCH] mutexes: Add CONFIG_DEBUG_MUTEX_FASTPATH=y debug variant to debug SMP races

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20131204091903.GA18675@hostway.ca>
Date:	Wed, 4 Dec 2013 01:19:03 -0800
From:	Simon Kirby <sim@...tway.ca>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Waiman Long <Waiman.Long@...com>,
	Ian Applegate <ia@...udflare.com>,
	Al Viro <viro@...iv.linux.org.uk>,
	Christoph Lameter <cl@...two.org>,
	Pekka Enberg <penberg@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Chris Mason <chris.mason@...ionio.com>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH] mutexes: Add CONFIG_DEBUG_MUTEX_FASTPATH=y debug variant
 to debug SMP races

On Tue, Dec 03, 2013 at 10:10:29AM -0800, Linus Torvalds wrote:

> On Tue, Dec 3, 2013 at 12:52 AM, Ingo Molnar <mingo@...nel.org> wrote:
> >
> > I'd expect such bugs to be more prominent with unlucky object
> > size/alignment: if mutex->count lies on a separate cache line from
> > mutex->wait_lock.
> 
> I doubt that makes much of a difference. It's still just "CPU cycles"
> away, and the window will be tiny unless you have multi-socket
> machines and/or are just very unlucky.
> 
> For stress-testing, it would be much better to use some hack like
> 
>     /* udelay a bit if the spinlock isn't contended */
>     if (mutex->wait_lock.ticket.head+1 == mutex->wait_lock.ticket.tail)
>         udelay(1);
> 
> in __mutex_unlock_common() just before the spin_unlock(). Make the
> window really *big*.

I haven't had a chance yet to do much testing of the proposed race fix
and race enlarging patches, but I did write a tool to reproduce the race.
I started it running and left for dinner, and sure enough, it actually
seems to work on plain 3.12 on a Dell PowerEdge r410 w/dual E5520,
reproducing "Poison overwritten" at a rate of about once every 15 minutes
(running 6 in parallel, booted with "slub_debug").

I have no idea if actually relying on tsc alignment between cores and
sockets is a reasonable idea these days, but it seems to work. I first
used a read() on a pipe close()d by the other process to synchronize
them, but this didn't seem to work as well as busy-waiting until the
timestamp counters pass a previously-decided-upon start time.

Meanwhile, I still don't understand how moving the unlock _up_ to cover
less of the code can solve the race, but I will stare at your long
explanation more tomorrow.

Simon-

View attachment "piperace.c" of type "text/x-csrc" (1653 bytes)