Message-ID: <20140417140036.GK11096@twins.programming.kicks-ass.net>
Date: Thu, 17 Apr 2014 16:00:36 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Will Deacon <will.deacon@....com>
Cc: linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
arnd@...db.de, monstr@...str.eu, dhowells@...hat.com,
broonie@...aro.org, benh@...nel.crashing.org,
paulmck@...ux.vnet.ibm.com
Subject: Re: [PATCH 00/18] Cross-architecture definitions of relaxed MMIO
accessors

On Thu, Apr 17, 2014 at 02:44:03PM +0100, Will Deacon wrote:
> Hello,
>
> This RFC series attempts to define a portable (i.e. cross-architecture)
> definition of the {readX,writeX}_relaxed MMIO accessor functions. These
> functions are already in widespread use amongst drivers (mainly those supporting
> devices embedded in ARM SoCs), but lack any well-defined semantics and,
> consequently, any portable definitions to allow these drivers to be compiled for
> other architectures.
>
> The two main motivations for this series are:
>
> (1) To promote use of the _relaxed MMIO accessors on weakly-ordered
> architectures, where they can bring significant performance improvements
> over their non-relaxed counterparts.
>
> (2) To allow COMPILE_TEST to build drivers using the relaxed accessors across
> all architectures.
>
> The proposed semantics largely match those provided by the ARM
> implementation (i.e. no weaker), with one exception (see below).
>
> Informally:
>
> - Relaxed accesses to the same device are ordered with respect to each other.
>
> - Relaxed accesses are *not* guaranteed to be ordered with respect to normal
> memory accesses (e.g. DMA buffers -- this is what gives us the performance
> boost over the non-relaxed versions).
>
> - Relaxed accesses are not guaranteed to be ordered with respect to
> LOCK/UNLOCK operations.
>
> In actual fact, the relaxed accessors *are* ordered with respect to LOCK/UNLOCK
> operations on ARM[64], but I have added this constraint for the benefit of
> PowerPC, which has expensive I/O barriers in the spin_unlock path for the
> non-relaxed accessors.
>
> A corollary to this is that mmiowb() probably needs rethinking. As it currently
> stands, an mmiowb() is required to order MMIO writes to a device from multiple
> CPUs, even if that device is protected by a lock. However, this isn't often used
> in practice, leading to PowerPC implementing both mmiowb() *and* synchronising
> I/O in spin_unlock.
>
> I would propose making the non-relaxed I/O accessors ordered with respect to
> LOCK/UNLOCK, leaving mmiowb() to be used with the relaxed accessors, if
> required, but would welcome thoughts/suggestions on this topic.

So the non-relaxed ops already imply the expensive I/O barrier (mmiowb?),
and PPC can therefore drop it from spin_unlock()?

Also, I read mmiowb() as MMIO-write-barrier(). What do we have to
order/contain MMIO reads?

I have _0_ experience with MMIO, so I've no idea whether ordering/containing
reads is silly or not.
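
To make the question concrete, here is a rough userspace sketch of the pattern as I read it from the cover letter: relaxed writes inside a locked section, with an explicit mmiowb() before the unlock. Everything prefixed mock_ or fake_ is a hypothetical stand-in I made up for illustration, not the kernel implementation, and the barrier here is only a compiler fence, so it shows where the call would sit rather than any real hardware ordering.

```c
/* Userspace mock only: all mock_ and fake_ names are hypothetical stand-ins. */
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define LOG_MAX 16

static const char *oplog[LOG_MAX];	/* order in which operations were issued */
static int oplog_len;

static void log_op(const char *op)
{
	assert(oplog_len < LOG_MAX);
	oplog[oplog_len++] = op;
}

static uint32_t fake_regs[4];	/* stands in for a device's MMIO window */
static int fake_lock;		/* stands in for a spinlock_t */

static void mock_spin_lock(int *lock)
{
	*lock = 1;
	log_op("lock");
}

static void mock_spin_unlock(int *lock)
{
	*lock = 0;
	log_op("unlock");
}

/* Relaxed write: per the quoted semantics, ordered only against other
 * accesses to the same device, not against LOCK/UNLOCK. */
static void mock_writel_relaxed(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
	log_op("writel_relaxed");
}

/* Placeholder for the I/O write barrier. A real one is architecture
 * specific; this is a compiler fence and nothing more. */
static void mock_mmiowb(void)
{
	atomic_signal_fence(memory_order_seq_cst);
	log_op("mmiowb");
}

/* Driver-style critical section: relaxed writes, then the barrier,
 * then the unlock. */
static void mock_kick_device(uint32_t cmd, uint32_t arg)
{
	mock_spin_lock(&fake_lock);
	mock_writel_relaxed(arg, &fake_regs[1]);
	mock_writel_relaxed(cmd, &fake_regs[0]);
	mock_mmiowb();
	mock_spin_unlock(&fake_lock);
}
```

Calling mock_kick_device(0x1, 0xabcd) records lock, writel_relaxed, writel_relaxed, mmiowb, unlock in oplog. If I read the proposal right, the mmiowb() call would only be needed with the relaxed accessors, and the non-relaxed ones would be ordered against the unlock on their own.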