Message-ID: <20160810191757.GA4952@linux-80c1.suse>
Date: Wed, 10 Aug 2016 12:17:57 -0700
From: Davidlohr Bueso <dave@...olabs.net>
To: Manfred Spraul <manfred@...orfullife.com>
Cc: Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Michael Ellerman <mpe@...erman.id.au>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Susanne Spraul <1vier1@....de>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: spin_lock implicit/explicit memory barrier
On Wed, 10 Aug 2016, Manfred Spraul wrote:
>On 08/10/2016 02:05 AM, Benjamin Herrenschmidt wrote:
>>On Tue, 2016-08-09 at 20:52 +0200, Manfred Spraul wrote:
>>>Hi Benjamin, Hi Michael,
>>>
>>>regarding commit 51d7d5205d33 ("powerpc: Add smp_mb() to
>>>arch_spin_is_locked()"):
>>>
>>>For the ipc/sem code, I would like to replace the spin_is_locked() with
>>>a smp_load_acquire(), see:
>>>
>>>http://git.cmpxchg.org/cgit.cgi/linux-mmots.git/tree/ipc/sem.c#n367
>>>
>>>http://www.ozlabs.org/~akpm/mmots/broken-out/ipc-semc-fix-complex_count-vs-simple-op-race.patch
>>>
>>>To my understanding, I must now add a smp_mb(), otherwise it would be
>>>broken on PowerPC:
>>>
>>>The approach of adding the memory barrier into spin_is_locked()
>>>doesn't work because the code doesn't use spin_is_locked().
>>>
>>>Correct?
>>Right, otherwise you aren't properly ordered. The current powerpc locks provide
>>good protection between what's inside vs. what's outside the lock, but not vs.
>>the lock *value* itself, so if, like you do in the sem code, you use the lock
>>value as something that is relevant in terms of ordering, you probably need
>>an explicit full barrier.
But the problem here is that spin_unlock_wait() (for ll/sc-based spin_lock) may not
see the store that makes the lock visibly taken, so both threads end up exiting
sem_lock(); a similar scenario to the spin_is_locked commit mentioned above, i.e.
crossing of locks.

Now that spin_unlock_wait() always implies at least a load-acquire barrier (for both
ticket locks and qspinlocks, the latter still x86-only), we wait on the full critical
region.
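Just to spell out what is being assumed (a conceptual sketch only, not any
architecture's actual spin_unlock_wait() implementation; sketch_spin_unlock_wait()
is a made-up name):

	/*
	 * Conceptual sketch only: spin until the lock word is observed free,
	 * then provide ACQUIRE semantics, so that everything after the call
	 * in program order is ordered against the previous lock holder's
	 * critical section.
	 */
	static inline void sketch_spin_unlock_wait(spinlock_t *lock)
	{
		while (spin_is_locked(lock))
			cpu_relax();
		smp_acquire__after_ctrl_dep();	/* promote the ctrl dependency to ACQUIRE */
	}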
So this patch takes this locking scheme:
CPU0                            CPU1
spin_lock(l)                    spin_lock(L)
spin_unlock_wait(L)             if (spin_is_locked(l))
foo()                               foo()
... and converts it now to:
CPU0                            CPU1
complex_mode = true             spin_lock(l)
smp_mb()                           <--- do we want a smp_mb() here?
spin_unlock_wait(l)             if (!smp_load_acquire(complex_mode))
foo()                               foo()
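(Roughly, in C; this is only a simplified sketch of the converted scheme from the
patch linked above, with error handling and the slow path trimmed, and the exact
function/field names may differ:)

	/* Complex op side (CPU0), simplified: */
	static void complexmode_enter(struct sem_array *sma)
	{
		int i;

		sma->complex_mode = true;
		/* the store to complex_mode must be visible before sampling the per-sem locks */
		smp_mb();
		for (i = 0; i < sma->sem_nsems; i++)
			spin_unlock_wait(&sma->sem_base[i].lock);
	}

	/* Simple op fast path (CPU1), simplified: */
	static bool sem_lock_fastpath(struct sem_array *sma, struct sem *sem)
	{
		spin_lock(&sem->lock);
		/* <--- the questioned smp_mb() would go here */
		if (!smp_load_acquire(&sma->complex_mode))
			return true;	/* no complex op in flight; keep holding sem->lock */
		spin_unlock(&sem->lock);
		return false;		/* fall back to the global sem_perm lock */
	}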
We should not be doing an smp_mb() right after a spin_lock(); it makes no sense. The
spinlock machinery should guarantee us the barriers in unorthodox locking cases such
as this.
Thanks,
Davidlohr