linux-kernel - Re: [PATCH] fix a race condition in cancelable mcs spinlocks

Message-ID: <alpine.LRH.2.02.1406021002090.1300@file01.intranet.prod.int.rdu2.redhat.com>
Date:	Mon, 2 Jun 2014 10:02:49 -0400 (EDT)
From:	Mikulas Patocka <mpatocka@...hat.com>
To:	John David Anglin <dave.anglin@...l.net>
cc:	Peter Zijlstra <peterz@...radead.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	jejb@...isc-linux.org, deller@....de, linux-parisc@...r.kernel.org,
	linux-kernel@...r.kernel.org, chegu_vinod@...com,
	paulmck@...ux.vnet.ibm.com, Waiman.Long@...com, tglx@...utronix.de,
	riel@...hat.com, akpm@...ux-foundation.org, davidlohr@...com,
	hpa@...or.com, andi@...stfloor.org, aswin@...com,
	scott.norton@...com, Jason Low <jason.low2@...com>
Subject: Re: [PATCH] fix a race condition in cancelable mcs spinlocks



On Mon, 2 Jun 2014, Mikulas Patocka wrote:

> 
> 
> On Sun, 1 Jun 2014, John David Anglin wrote:
> 
> > On 1-Jun-14, at 3:20 PM, Peter Zijlstra wrote:
> > 
> > > > If you write to some variable with ACCESS_ONCE and use cmpxchg or xchg at
> > > > the same time, you break it. ACCESS_ONCE doesn't take the hashed spinlock,
> > > > so, in this case, cmpxchg or xchg isn't really atomic at all.
> > > 
> > > And this is really the first place in the kernel that breaks like this?
> > > I've been using xchg() and cmpxchg() without such consideration for
> > > quite a while.
> > 
> > I believe Mikulas is correct.  Even in a controlled situation where a 
> > cmpxchg operation is used to implement pthread_spin_lock() in userspace, 
> > we found recently that the lock must be released with a cmpxchg 
> > operation and not a simple write on SMP systems. There is a race in the 
> > cache operations or instruction ordering that's not present with the 
> > ldcw instruction.
> > 
> > Dave
> > --
> > John David Anglin	dave.anglin@...l.net
> 
> That is strange.
> 
> A spinlock with cmpxchg on lock and a single write on unlock should work,
> assuming that cmpxchg doesn't write to the target address when it detects
> a mismatch (the cmpxchg in the kernel syscall page doesn't; it nullifies
> the write instruction on mismatch).
> 
> Do you have some code that reproduces this misbehavior?
> 
> We really need to find out why it behaves this way:
> - is PA-RISC really out of order? (we used to believe that it is in-order
>   and we have empty barrier instructions in the kernel). Does adding the
>   "SYNC" instruction before the write in pthread_spin_unlock fix it?
> - does the processor perform nullified writes unconditionally? Does
>   moving the write in the cmpxchg implementation from the nullified slot
>   to its own branch fix it?
> - does adding a dummy "ldcw" instruction to an unrelated address fix it?
>   Is it that "ldcw" has some magic barrier properties?

- and there is the "stw,o" instruction, which does an ordered store according
to the specification, so we should test it too...

> I think we need to perform these tests and maybe some more to find out
> what really happened there...
> 
> BTW, in Debian 5 (libc 2.7), pthread_spin_lock uses ldcw and 
> pthread_spin_unlock uses a single write (just like the kernel spinlock 
> implementation). In Debian-ports (libc 2.18), both pthread_spin_lock and 
> pthread_spin_unlock call the kernel syscall page. What was the reason for 
> switching to a less efficient implementation?
> 
> Mikulas
> 
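For reference, the scheme discussed above (cmpxchg on lock, a single write
on unlock) can be sketched in portable C11 roughly as follows. This is only
an illustration, not the glibc or kernel code: the names sketch_spinlock,
sketch_lock, sketch_unlock and run_sketch are made up here, and it assumes a
native atomic compare-and-swap. On PA-RISC, cmpxchg is emulated (the kernel
syscall page mentioned above), so this sketch says nothing about whether the
plain store is safe there; it only shows the pattern being questioned.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names; 0 = unlocked, 1 = locked. */
typedef struct { atomic_int locked; } sketch_spinlock;

/* Lock: spin on compare-and-swap 0 -> 1. A failed CAS does not write. */
static void sketch_lock(sketch_spinlock *l)
{
    int expected = 0;
    while (!atomic_compare_exchange_weak_explicit(&l->locked, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
        expected = 0; /* CAS overwrote it with the observed value */
}

/* Unlock: a single store, with release ordering instead of a CAS. */
static void sketch_unlock(sketch_spinlock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

#define SKETCH_ITERS 100000
#define SKETCH_MAX_THREADS 8

static sketch_spinlock lock;   /* zero-initialized: unlocked */
static long counter;           /* protected by 'lock' */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < SKETCH_ITERS; i++) {
        sketch_lock(&lock);
        counter++;             /* non-atomic; relies on the lock */
        sketch_unlock(&lock);
    }
    return NULL;
}

/* Run nthreads workers and return the final counter value. */
static long run_sketch(int nthreads)
{
    pthread_t tid[SKETCH_MAX_THREADS];

    assert(nthreads > 0 && nthreads <= SKETCH_MAX_THREADS);
    counter = 0;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return counter;
}
```

If the lock is correct, run_sketch(n) returns n * SKETCH_ITERS; lost
increments would indicate the kind of misbehavior being asked about. Build
with -pthread.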