Message-ID: <alpine.LRH.2.02.1406021002090.1300@file01.intranet.prod.int.rdu2.redhat.com>
Date: Mon, 2 Jun 2014 10:02:49 -0400 (EDT)
From: Mikulas Patocka <mpatocka@...hat.com>
To: John David Anglin <dave.anglin@...l.net>
cc: Peter Zijlstra <peterz@...radead.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
jejb@...isc-linux.org, deller@....de, linux-parisc@...r.kernel.org,
linux-kernel@...r.kernel.org, chegu_vinod@...com,
paulmck@...ux.vnet.ibm.com, Waiman.Long@...com, tglx@...utronix.de,
riel@...hat.com, akpm@...ux-foundation.org, davidlohr@...com,
hpa@...or.com, andi@...stfloor.org, aswin@...com,
scott.norton@...com, Jason Low <jason.low2@...com>
Subject: Re: [PATCH] fix a race condition in cancelable mcs spinlocks
On Mon, 2 Jun 2014, Mikulas Patocka wrote:
>
>
> On Sun, 1 Jun 2014, John David Anglin wrote:
>
> > On 1-Jun-14, at 3:20 PM, Peter Zijlstra wrote:
> >
> > > > If you write to some variable with ACCESS_ONCE and use cmpxchg or xchg at
> > > > the same time, you break it. ACCESS_ONCE doesn't take the hashed spinlock,
> > > > so, in this case, cmpxchg or xchg isn't really atomic at all.
> > >
> > > And this is really the first place in the kernel that breaks like this?
> > > I've been using xchg() and cmpxchg() without such consideration for
> > > quite a while.
> >
> > I believe Mikulas is correct. Even in a controlled situation where a
> > cmpxchg operation is used to implement pthread_spin_lock() in userspace,
> > we found recently that the lock must be released with a cmpxchg
> > operation and not a simple write on SMP systems. There is a race in the
> > cache operations or instruction ordering that's not present with the
> > ldcw instruction.
> >
> > Dave
> > --
> > John David Anglin dave.anglin@...l.net
>
> That is strange.
>
> Spinlock with cmpxchg on lock and a single write on unlock should work,
> assuming that cmpxchg doesn't write to the target address when it detects
> mismatch (the cmpxchg in the kernel syscall page doesn't do it, it
> nullifies the write instruction on mismatch).
>
> Do you have some code that reproduces this misbehavior?
>
> We really need to find out why it behaves this way:
> - is PA-RISC really out of order? (we used to believe that it is in-order
> and we have empty barrier instructions in the kernel). Does adding the
> "SYNC" instruction before the write in pthread_spin_unlock fix it?
> - does the processor perform nullified writes unconditionally? Does
> moving the write in the cmpxchg implementation from the nullified slot
> to its own branch fix it?
> - does adding a dummy "ldcw" instruction to an unrelated address fix it?
> Is it that "ldcw" has some magic barrier properties?
- and there is also the "stw,o" instruction, which does an ordered store
according to the specification, so we should test it too...
> I think we need to perform these tests and maybe some more to find out
> what really happened there...
>
> BTW. in Debian 5 libc 2.7, pthread_spin_lock uses ldcw and
> pthread_spin_unlock uses a single write (just like the kernel spinlock
> implementation). In Debian-ports libc 2.18, both pthread_spin_lock and
> pthread_spin_unlock call the kernel syscall page. What was the reason for
> switching to a less efficient implementation?
>
> Mikulas
>
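The lock scheme discussed in the quoted message (cmpxchg to acquire, a single write to release) can be sketched in portable C11 as below. This is only an illustration under stated assumptions: it uses C11 atomics rather than the PA-RISC kernel syscall page or ldcw, and every name in it (sketch_spinlock, sketch_lock, sketch_unlock, run_counter_test) is hypothetical and does not come from glibc or the kernel. It could serve as a starting point for a reproducer, with the unlock path swapped for the variants listed above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names throughout; this is not the glibc or kernel code. */
typedef struct { atomic_int locked; } sketch_spinlock;

static void sketch_lock(sketch_spinlock *l)
{
    int expected = 0;
    /* cmpxchg: store 1 only if the lock word is 0; a failed compare
     * performs no store to the lock word. */
    while (!atomic_compare_exchange_weak_explicit(&l->locked, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
        expected = 0; /* a failed exchange overwrote it with the observed value */
}

static void sketch_unlock(sketch_spinlock *l)
{
    /* Release with a single store, as in the scheme under discussion. */
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

struct worker_arg { sketch_spinlock *lock; long *counter; long iters; };

static void *worker(void *p)
{
    struct worker_arg *a = p;
    for (long i = 0; i < a->iters; i++) {
        sketch_lock(a->lock);
        (*a->counter)++;          /* protected by the spinlock */
        sketch_unlock(a->lock);
    }
    return NULL;
}

/* Runs nthreads workers (at most 16) and returns the final counter value.
 * A result below nthreads * iters would indicate lost updates. */
long run_counter_test(int nthreads, long iters)
{
    sketch_spinlock lock = { 0 };
    long counter = 0;
    pthread_t tid[16];
    struct worker_arg arg = { &lock, &counter, iters };

    assert(nthreads > 0 && nthreads <= 16);
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, &arg);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return counter;
}
```

With a correct cmpxchg and correctly ordered stores, run_counter_test(n, k) returns n * k. Whether a cmpxchg emulated through a hashed spinlock keeps that guarantee when unlock is a plain write is exactly the open question in this thread, so this sketch does not settle it.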