Message-ID: <20190613180029.GO3436@hirez.programming.kicks-ass.net>
Date: Thu, 13 Jun 2019 20:00:29 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Alan Stern <stern@...land.harvard.edu>
Cc: David Howells <dhowells@...hat.com>, akiyks@...il.com,
andrea.parri@...rulasolutions.com, boqun.feng@...il.com,
dlustig@...dia.com, j.alglave@....ac.uk, luc.maranget@...ia.fr,
npiggin@...il.com, paulmck@...ux.ibm.com, will.deacon@....com,
paul.burton@...s.com, linux-kernel@...r.kernel.org,
torvalds@...ux-foundation.org
Subject: Re: [PATCH v2 0/4] atomic: Fixes to smp_mb__{before,after}_atomic()
 and mips.

On Thu, Jun 13, 2019 at 12:58:11PM -0400, Alan Stern wrote:
> On Thu, 13 Jun 2019, David Howells wrote:
>
> > Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > > Basically we fail for:
> > >
> > > *x = 1;
> > > atomic_inc(u);
> > > smp_mb__after_atomic();
> > > r0 = *y;
> > >
> > > Because, while the atomic_inc() implies memory order, it
> > > (surprisingly) does not provide a compiler barrier. This then allows
> > > the compiler to re-order like so:
> >
> > To quote memory-barriers.txt:
> >
> > (*) smp_mb__before_atomic();
> > (*) smp_mb__after_atomic();
> >
> > These are for use with atomic (such as add, subtract, increment and
> > decrement) functions that don't return a value, especially when used for
> > reference counting. These functions do not imply memory barriers.
> >
> > so it's entirely to be expected?
>
> The text is perhaps ambiguous. It means that the atomic functions
> which don't return values -- like atomic_inc() -- do not imply memory
> barriers. It doesn't mean that smp_mb__before_atomic() and
> smp_mb__after_atomic() do not imply memory barriers.
>
> The behavior Peter described is not to be expected. The expectation is
> that the smp_mb__after_atomic() in the example should force the "*x =
> 1" store to execute before the "r0 = *y" load. But on current x86 it
> doesn't force this, for the reason explained in the description.

Indeed, thanks Alan.
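
To spell out the reordering described in the quoted mail (a sketch; it
assumes the pre-fix x86 atomics, whose inline asm lacks a "memory"
clobber, and that smp_mb__after_atomic() expands to barrier() there):

	atomic_inc(u);			/* moved up: no "memory" clobber */
	*x = 1;				/* the store sank below the atomic */
	smp_mb__after_atomic();		/* barrier(); too late to help */
	r0 = *y;

With the plain store now after the LOCK-prefixed instruction, TSO
permits the CPU to satisfy the "r0 = *y" load before the "*x = 1" store
is visible to other CPUs, so the intended store->load ordering is lost.
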
The other approach would be to upgrade smp_mb__{before,after}_atomic()
to actual full memory barriers on x86, but that seems quite ridiculous
since atomic_inc() already does all the expensive bits and is only
missing the compiler barrier.

That would result in code like:

	mov	$1, x			# *x = 1
	lock inc u			# atomic_inc(u)
	lock addl $0, -4(%rsp)		# aka smp_mb()
	mov	y, %r			# r0 = *y

which is really quite silly.

And as noted in the Changelog, about half the non-value-returning
atomics already implied the compiler barrier anyway.
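
For completeness, a minimal standalone sketch of that fix: give the
non-value-returning x86 atomics the "memory" clobber they were missing
(my_atomic_t and my_atomic_inc are made-up names for illustration, not
the kernel's):

	typedef struct { int counter; } my_atomic_t;

	static inline void my_atomic_inc(my_atomic_t *v)
	{
		asm volatile("lock incl %0"
			     : "+m" (v->counter)
			     :			/* no inputs */
			     : "memory");	/* the added compiler barrier:
						 * the compiler may no longer
						 * reorder other memory accesses
						 * across the atomic itself */
	}

The "memory" clobber is exactly what barrier() provides, so with it in
place smp_mb__{before,after}_atomic() need not emit anything extra on
x86.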