Message-ID: <20160426152844.GZ3448@twins.programming.kicks-ass.net>
Date: Tue, 26 Apr 2016 17:28:44 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Chris Metcalf <cmetcalf@...lanox.com>
Cc: torvalds@...ux-foundation.org, mingo@...nel.org,
tglx@...utronix.de, will.deacon@....com,
paulmck@...ux.vnet.ibm.com, boqun.feng@...il.com,
waiman.long@....com, fweisbec@...il.com,
linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
rth@...ddle.net, vgupta@...opsys.com, linux@....linux.org.uk,
egtvedt@...fundet.no, realmz6@...il.com,
ysato@...rs.sourceforge.jp, rkuo@...eaurora.org,
tony.luck@...el.com, geert@...ux-m68k.org, james.hogan@...tec.com,
ralf@...ux-mips.org, dhowells@...hat.com, jejb@...isc-linux.org,
mpe@...erman.id.au, schwidefsky@...ibm.com, dalias@...c.org,
davem@...emloft.net, jcmvbkbc@...il.com, arnd@...db.de,
dbueso@...e.de, fengguang.wu@...el.com
Subject: Re: [RFC][PATCH 22/31] locking,tile: Implement
atomic{,64}_fetch_{add,sub,and,or,xor}()
On Mon, Apr 25, 2016 at 04:54:34PM -0400, Chris Metcalf wrote:
> On 4/22/2016 5:04 AM, Peter Zijlstra wrote:
> > static inline int atomic_add_return(int i, atomic_t *v)
> > {
> > 	int val;
> > 	smp_mb(); /* barrier for proper semantics */
> > 	val = __insn_fetchadd4((void *)&v->counter, i) + i;
> > 	barrier(); /* the "+ i" above will wait on memory */
> >+	/* XXX smp_mb() instead, as per cmpxchg() ? */
> > 	return val;
> > }
>
> The existing code is subtle but I'm pretty sure it's not a bug.
>
> The tilegx architecture will take the "+ i" and generate an add instruction.
> The compiler barrier will make sure the add instruction happens before
> anything else that could touch memory, and the microarchitecture will make
> sure that the result of the atomic fetchadd has been returned to the core
> before any further instructions are issued. (The memory architecture is
> lazy, but when you feed a load through an arithmetic operation, we block
> issuing any further instructions until the add's operands are available.)
>
> This would not be an adequate memory barrier in general, since other loads
> or stores might still be in flight, even if the "val" operand had made it
> from memory to the core at this point. However, we have issued no other
> stores or loads since the previous memory barrier, so we know that there
> can be no other loads or stores in flight, and thus the compiler barrier
> plus arithmetic op is equivalent to a memory barrier here.
>
> In hindsight, perhaps a more substantial comment would have been helpful
> here. Unless you see something missing in my analysis, I'll plan to go
> ahead and add a suitable comment here :-)
>
> Otherwise, though just based on code inspection so far:
>
> Acked-by: Chris Metcalf <cmetcalf@...lanox.com> [for tile]

Thanks!

Just to verify; the new fetch-op thingies _do_ indeed need the extra
smp_mb() as per my patch, because there is no trailing instruction
depending on the completion of the load?
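
For concreteness, a minimal userspace sketch of the two shapes in
question. It is an illustration only: sketch_atomic_t,
sketch_fetchadd4(), sketch_barrier() and sketch_smp_mb() are
hypothetical stand-ins built on C11 <stdatomic.h> for atomic_t,
__insn_fetchadd4(), barrier() and smp_mb(), and the fetch_add body is
an assumed shape for the variant with the extra smp_mb(), not a quote
from the patch. The stand-ins only reproduce the structure of the code;
the ordering argument itself is specific to tilegx.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins so this builds off-tile; NOT the kernel definitions. */
typedef struct { atomic_int counter; } sketch_atomic_t;

/* Compiler-only barrier, standing in for barrier(). */
#define sketch_barrier()	atomic_signal_fence(memory_order_seq_cst)
/* Full memory barrier, standing in for smp_mb(). */
#define sketch_smp_mb()		atomic_thread_fence(memory_order_seq_cst)

/* Stand-in for __insn_fetchadd4(): unordered fetch-and-add, returns the old value. */
static inline int sketch_fetchadd4(sketch_atomic_t *v, int i)
{
	return atomic_fetch_add_explicit(&v->counter, i, memory_order_relaxed);
}

/*
 * add_return shape: the "+ i" consumes the loaded value, which (per the
 * analysis quoted above) stalls issue on tilegx until the fetchadd result
 * has arrived, so a compiler barrier is enough afterwards.
 */
static inline int sketch_atomic_add_return(int i, sketch_atomic_t *v)
{
	int val;

	sketch_smp_mb();
	val = sketch_fetchadd4(v, i) + i;
	sketch_barrier();
	return val;
}

/*
 * fetch_add shape: the old value is returned untouched, so no trailing
 * instruction depends on the load; hence the explicit trailing barrier.
 */
static inline int sketch_atomic_fetch_add(int i, sketch_atomic_t *v)
{
	int val;

	sketch_smp_mb();
	val = sketch_fetchadd4(v, i);
	sketch_smp_mb();
	return val;
}
```

The only difference between the two is whether an arithmetic op sits
between the fetchadd and the return; that is the dependency the question
above is about.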