Message-ID: <20090527184041.GA22545@Krystal>
Date: Wed, 27 May 2009 14:40:41 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To: Catalin Marinas <catalin.marinas@....com>
Cc: Russell King - ARM Linux <linux@....linux.org.uk>,
Jamie Lokier <jamie@...reable.org>,
linux-arm-kernel@...ts.arm.linux.org.uk,
linux-kernel@...r.kernel.org,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: Broken ARM atomic ops wrt memory barriers (was : [PATCH] Add
cmpxchg support for ARMv6+ systems)
* Catalin Marinas (catalin.marinas@....com) wrote:
> On Tue, 2009-05-26 at 21:22 -0400, Mathieu Desnoyers wrote:
> > So, my question is: is the ARMv7 weak memory ordering model as weak as
> > Alpha's?
>
> I'm not familiar with Alpha, but ARM allows a weakly ordered memory
> system (starting with ARMv6); it's up to the processor implementer to
> decide how weak, within the ARM ARM restrictions (section A3.8.2).
>
> I think the main difference with Alpha is that ARM doesn't do
> speculative writes, only speculative reads. The write cannot become
> visible to other observers in the same shareability domain before the
> instruction occurs in program order. But because of the write buffer,
> there is no guarantee on the order of two writes becoming visible to
> other observers in the same shareability domain. The reads from normal
> memory can happen speculatively (with a few restrictions).
>
> Summarising from the ARM ARM, there are two terms used:
>
> Address dependency - an address dependency exists when the value
> returned by a read access is used to compute the virtual address
> of a subsequent read or write access.
>
> Control dependency - a control dependency exists when the data
> value returned by a read access is used to determine the
> condition code flags, and the values of the flags are used for
> condition code checking to determine the address of a subsequent
> read access.
>
> The (simplified) memory ordering restrictions of two explicit accesses
> (where multiple observers are present and in the same shareability
> domain):
>
> * If there is an address dependency then the two memory accesses
> are observed in program order by any observer
> * If the value returned by a read access is used as data written
> by a subsequent write access, then the two memory accesses are
> observed in program order
> * It is impossible for an observer of a memory location to observe
> a write access to that memory location if that location would
> not be written to in a sequential execution of a program
>
> Outside of these restrictions, the processor implementer can do whatever
> makes the CPU faster. To ensure the relative ordering between memory
> accesses (either read or write), the software should use DMB
> instructions.
>
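To make the address dependency case concrete, here is a small user-space
sketch in C11 atomics (not kernel code). The names demo_msg, demo_slot,
demo_publish and demo_consume are made up for illustration, and
memory_order_consume is only my assumption of the closest portable
analogue of a dependency-ordered read; compilers commonly implement it
as acquire.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical message type, for illustration only. */
struct demo_msg {
	int payload;
};

/* Shared pointer through which a message is published (starts as NULL). */
static _Atomic(struct demo_msg *) demo_slot;

/* Writer: fill in the message, then publish its address. */
static void demo_publish(struct demo_msg *m, int value)
{
	m->payload = value;
	/* Release ordering stands in for the write-side barrier. */
	atomic_store_explicit(&demo_slot, m, memory_order_release);
}

/* Reader: returns the payload, or 'fallback' if nothing is published. */
static int demo_consume(int fallback)
{
	/*
	 * The value returned by this read is used to compute the address
	 * of the next read: that is the address dependency described in
	 * the quoted text.
	 */
	struct demo_msg *m =
		atomic_load_explicit(&demo_slot, memory_order_consume);

	if (m == NULL)
		return fallback;
	return m->payload;
}
```

The point is only that the second read cannot be issued without the
result of the first one, which is the property the ordering rule relies
on.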
Just to make sure:
for the read seqlock, an smp_rmb() is present. I assume that, given
there is neither an address nor a control dependency (as stated above)
between the seqlock value reads and the data accesses, these barriers
cannot be downgraded to smp_read_barrier_depends(). It's a shame to have
to do two full dmb for every sequence lock. Are there any plans on the
ARM side to eventually add faster read barriers?
Basically, on ARM, a seqlock fast path takes 11 cycles on UP. If we add
the two dmb, it takes 73 cycles.
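For reference, here is roughly the read path I have in mind, as a
user-space sketch in C11 atomics rather than the actual kernel seqlock
code. demo_seqlock, demo_write and demo_read are hypothetical names, a
single writer is assumed, and the acquire/release fences are my stand-in
for smp_rmb()/smp_wmb().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space seqlock protecting two integers. */
struct demo_seqlock {
	atomic_uint seq;	/* even: stable, odd: write in progress */
	atomic_int a, b;	/* protected data (relaxed atomics avoid data races) */
};

/* Writer side; assumes a single writer, so no lock is taken. */
static void demo_write(struct demo_seqlock *sl, int a, int b)
{
	unsigned int s = atomic_load_explicit(&sl->seq, memory_order_relaxed);

	assert((s & 1u) == 0);	/* no write may already be in progress */
	atomic_store_explicit(&sl->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* stand-in for smp_wmb() */
	atomic_store_explicit(&sl->a, a, memory_order_relaxed);
	atomic_store_explicit(&sl->b, b, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* stand-in for smp_wmb() */
	atomic_store_explicit(&sl->seq, s + 2, memory_order_relaxed);
}

/* Reader side: retry until a consistent snapshot is seen. */
static void demo_read(struct demo_seqlock *sl, int *a, int *b)
{
	unsigned int start, end;

	do {
		start = atomic_load_explicit(&sl->seq, memory_order_relaxed);
		/* First read barrier: the data reads do not depend on 'start'. */
		atomic_thread_fence(memory_order_acquire);
		*a = atomic_load_explicit(&sl->a, memory_order_relaxed);
		*b = atomic_load_explicit(&sl->b, memory_order_relaxed);
		/* Second read barrier, before re-checking the sequence. */
		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&sl->seq, memory_order_relaxed);
	} while ((start & 1u) || start != end);
}
```

Neither data read takes its address from the sequence value, so there
is no dependency for the hardware to honour, and both fences in
demo_read() stay. Those are the two dmb I am counting above.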
Mathieu
> --
> Catalin
>
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68