Message-ID: <20160309145119.GN6356@twins.programming.kicks-ass.net>
Date: Wed, 9 Mar 2016 15:51:19 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Vineet Gupta <Vineet.Gupta1@...opsys.com>
Cc: "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
linux-parisc@...r.kernel,
Andrew Morton <akpm@...ux-foundation.org>,
Helge Deller <deller@....de>, linux-kernel@...r.kernel.org,
stable@...r.kernel.org,
"James E.J. Bottomley" <jejb@...isc-linux.org>,
Pekka Enberg <penberg@...nel.org>, linux-mm@...ck.org,
Noam Camus <noamc@...hip.com>,
David Rientjes <rientjes@...gle.com>,
Christoph Lameter <cl@...ux.com>,
linux-snps-arc@...ts.infradead.org,
Joonsoo Kim <iamjoonsoo.kim@....com>
Subject: Re: [PATCH] mm: slub: Ensure that slab_unlock() is atomic
On Wed, Mar 09, 2016 at 06:52:45PM +0530, Vineet Gupta wrote:
> On Wednesday 09 March 2016 03:43 PM, Peter Zijlstra wrote:
> >> There is clearly a problem in slub code that it is pairing a test_and_set_bit()
> >> with a __clear_bit(). Latter can obviously clobber former if they are not a single
> >> instruction each unlike x86 or they use llock/scond kind of instructions where the
> >> interim store from other core is detected and causes a retry of whole llock/scond
> >> sequence.
> >
> > Yes, test_and_set_bit() + __clear_bit() is broken.
>
> But in SLUB: bit_spin_lock() + __bit_spin_unlock() is acceptable ? How so
> (ignoring the performance thing for discussion sake, which is a side effect of
> this implementation).
The short answer is: By definition. They are defined to work together,
which is what makes __clear_bit_unlock() such a special function.
> So despite the comment below in bit_spinlock.h I don't quite comprehend how this
> is allowable. And if say, by deduction, this is fine for LLSC or lock prefixed
> cases, then isn't this true in general for lot more cases in kernel, i.e. pairing
> atomic lock with non-atomic unlock ? I'm missing something !
x86 (and others) do in fact use non-atomic instructions for
spin_unlock(). But as this is all arch specific, we can make these
assumptions. It's just that generic code cannot rely on it.
So let me try and explain.
The problem as identified is:

	CPU0					CPU1

	bit_spin_lock()				__bit_spin_unlock()
	1:
		/* fetch_or, r1 holds the old value */
		spin_lock
		load	r1, addr
						load	r1, addr
						bclr	r2, r1, 1
						store	r2, addr
		or	r2, r1, 1
		store	r2, addr	/* lost the store from CPU1 */
		spin_unlock

		and	r1, 1
		bnz	2	/* it was set, go wait */
		ret

	2:
		load	r1, addr
		and	r1, 1
		bnz	2	/* wait until it's not set */

		b	1	/* try again */

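
To make the interleaving above concrete, here is a minimal single-threaded sketch in plain C that replays the same sequence of loads and stores by hand. It is not kernel code: `replay_lost_unlock()`, `struct replay_result` and `LOCK_BIT` are hypothetical names made up for this illustration, and the per-CPU "registers" are just local variables.

```c
#include <assert.h>

#define LOCK_BIT 1UL	/* hypothetical: the bit used as the bit-spinlock */

/* Hypothetical record of the replay outcome; not a kernel structure. */
struct replay_result {
	unsigned long final_word;	/* value left in memory at the end */
	unsigned long cpu0_old;		/* old value CPU0's fetch_or saw (its r1) */
};

/*
 * Replay the problematic interleaving in program order.
 * 'initial' must have LOCK_BIT set, i.e. CPU1 currently holds the lock.
 */
static struct replay_result replay_lost_unlock(unsigned long initial)
{
	unsigned long word = initial;
	unsigned long cpu0_r1, cpu0_r2, cpu1_r1, cpu1_r2;
	struct replay_result res;

	assert(initial & LOCK_BIT);

	cpu0_r1 = word;			/* CPU0: load  r1, addr      */
	cpu1_r1 = word;			/* CPU1: load  r1, addr      */
	cpu1_r2 = cpu1_r1 & ~LOCK_BIT;	/* CPU1: bclr  r2, r1, 1     */
	word = cpu1_r2;			/* CPU1: store r2, addr      */
	cpu0_r2 = cpu0_r1 | LOCK_BIT;	/* CPU0: or    r2, r1, 1     */
	word = cpu0_r2;			/* CPU0: store r2, addr,
					 * overwriting CPU1's unlock */

	res.final_word = word;
	res.cpu0_old = cpu0_r1;
	return res;
}
```

Calling `replay_lost_unlock(1UL)` leaves the lock bit set in memory while CPU0's old value also has the bit set, so CPU0 goes off to wait at label 2 for a lock that nobody holds any more.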
For LL/SC we replace:

		spin_lock
		load	r1, addr
		...
		store	r2, addr
		spin_unlock

With the (obvious):

	1:
		load-locked	r1, addr
		...
		store-cond	r2, addr
		bnz	1 /* or whatever branch instruction is required to retry */

In this case the failure cannot happen, because the store from CPU1
would have invalidated the lock from CPU0 and caused the
store-cond to fail and retry the loop, observing the new value.
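
The retry behaviour can be approximated in portable C11 with a compare-and-swap loop. This is only a sketch under stated assumptions: C11 has no LL/SC primitive, so `atomic_compare_exchange_weak()` stands in for the load-locked/store-cond pair (it compares values rather than detecting any intervening store, which is weaker than real LL/SC). The names `fetch_or_retry()`, `cpu1_unlock()`, `replay_with_retry()` and the `interfere` hook are hypothetical and exist only to replay CPU1's store at the critical point in a single thread.

```c
#include <stdatomic.h>

#define LOCK_BIT 1UL	/* hypothetical: the bit used as the bit-spinlock */

/* Stands in for CPU1's unlock store landing between CPU0's load and store. */
static void cpu1_unlock(_Atomic unsigned long *addr)
{
	atomic_store(addr, atomic_load(addr) & ~LOCK_BIT);
}

/*
 * fetch_or built from a retry loop. 'interfere', if non-NULL, is called
 * once after the initial load to simulate a store from another CPU.
 * Returns the old value that the successful update was based on.
 */
static unsigned long fetch_or_retry(_Atomic unsigned long *addr,
				    unsigned long mask,
				    void (*interfere)(_Atomic unsigned long *))
{
	unsigned long old = atomic_load(addr);	/* roughly: load-locked */

	if (interfere)
		interfere(addr);		/* CPU1's store happens here */

	/*
	 * Roughly: store-cond plus the branch back to 1. On failure the
	 * current memory value is written into 'old', so the next attempt
	 * is computed from the new value.
	 */
	while (!atomic_compare_exchange_weak(addr, &old, old | mask))
		;

	return old;
}

/* Returns 1 if CPU0 ends up owning the lock after CPU1's racing unlock. */
static int replay_with_retry(void)
{
	_Atomic unsigned long word = LOCK_BIT;	/* CPU1 holds the lock */
	unsigned long old = fetch_or_retry(&word, LOCK_BIT, cpu1_unlock);

	return !(old & LOCK_BIT) && (atomic_load(&word) & LOCK_BIT);
}
```

In this replay the first update attempt fails because the word changed underneath it; the retry sees the cleared bit, sets it, and returns an old value without the lock bit, so CPU0 correctly takes the lock instead of waiting forever.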