Message-ID: <1522211620.7364.94.camel@kernel.crashing.org>
Date: Wed, 28 Mar 2018 15:33:40 +1100
From: Benjamin Herrenschmidt <benh@...nel.crashing.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Alexander Duyck <alexander.duyck@...il.com>,
Will Deacon <will.deacon@....com>,
Sinan Kaya <okaya@...eaurora.org>,
Arnd Bergmann <arnd@...db.de>, Jason Gunthorpe <jgg@...pe.ca>,
David Laight <David.Laight@...lab.com>,
Oliver <oohall@...il.com>,
"open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)"
<linuxppc-dev@...ts.ozlabs.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
Alexander Duyck <alexander.h.duyck@...hat.com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: RFC on writel and writel_relaxed
On Tue, 2018-03-27 at 16:51 -1000, Linus Torvalds wrote:
> On Tue, Mar 27, 2018 at 3:03 PM, Benjamin Herrenschmidt
> <benh@...nel.crashing.org> wrote:
> >
> > The discussion at hand is about
> >
> > dma_buffer->foo = 1; /* WB */
> > writel(KICK, DMA_KICK_REGISTER); /* UC */
>
> Yes. That certainly is ordered on x86. In fact, afaik it's ordered
> even if that writel() might be of type WC, because that only delays
> writes, it doesn't move them earlier.
Ok so this is our answer ...
... snip ... (thanks for the background info !)
> Oh, the above UC case is absolutely guaranteed.
Good.
Then....
> The only issue really is that 99.9% of all testing gets done on x86
> unless you look at specific SoC drivers.
>
> On ARM, for example, there is likely little reason to care about x86
> memory ordering, because there is almost zero driver overlap between
> x86 and ARM.
>
> *Historically*, the reason for following the x86 IO ordering was
> simply that a lot of architectures used the drivers that were
> developed on x86. The alpha and powerpc workstations were *designed*
> with the x86 IO bus (PCI, then PCIe) and to work with the devices that
> came with it.
>
> ARM? PCIe is almost irrelevant. For ARM servers, if they ever take
> off, sure. But 99.99% of ARM is about their own SoC's, and so "x86
> test coverage" is simply not an issue.
>
> How much of an issue is it for Power? Maybe you decide it's not a big deal.
>
> Then all the above is almost irrelevant.
So the overlap may not be that NIL in practice :-) But even then that
doesn't matter, as ARM has been happily implementing the same semantics
you describe above for years, as do we on powerpc.
This is why I want (with your agreement) to define clearly, once and
for all, that the Linux semantics of writel() are that it is ordered
with previous writes to coherent memory (*)
This is already what ARM and powerpc provide and, from what you say,
what x86 provides. I don't see any reason to keep that badly documented
and have drivers randomly growing useless wmb()'s because their authors
don't think it works on x86 without them !
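To make the pattern concrete, here is a minimal userspace sketch of the
"fill a descriptor, then kick" sequence under the semantics proposed
above. Everything prefixed demo_ is a hypothetical stand-in invented for
illustration: demo_ring plays the role of dma_alloc_coherent() memory,
demo_doorbell the role of an ioremap()ed register, and demo_writel()
only models the ordering contract with a C11 release fence. It is not
how any architecture actually implements writel().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical descriptor layout, for illustration only. */
struct demo_desc {
	uint32_t addr;
	uint32_t len;
	uint32_t valid;
};

#define DEMO_RING_SIZE 4

static struct demo_desc demo_ring[DEMO_RING_SIZE]; /* pretend: coherent DMA memory */
static volatile uint32_t demo_doorbell;            /* pretend: UC MMIO register */

/*
 * Model of the proposed contract: stores to coherent memory issued
 * before this call are ordered before the register store.
 */
static inline void demo_writel(uint32_t val, volatile uint32_t *reg)
{
	atomic_thread_fence(memory_order_release);
	*reg = val;
}

/* Publish one descriptor, then kick the (pretend) device. */
static void demo_post(unsigned int idx, uint32_t addr, uint32_t len)
{
	assert(idx < DEMO_RING_SIZE);

	demo_ring[idx].addr = addr;
	demo_ring[idx].len = len;
	demo_ring[idx].valid = 1;

	/* No explicit wmb() here: the ordering is part of demo_writel(). */
	demo_writel(idx + 1, &demo_doorbell);
}
```

With that contract, a driver calling something like demo_post() would
not need its own barrier between the descriptor stores and the kick.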
Once that's sorted, let's tackle the problem of mmiowb vs. spin_unlock
and the problem of writel_relaxed semantics but as separate issues :-)
Also, can I assume the above ordering with writel() equally applies to
readl() or not ?
IE:
dma_buf->foo = 1;
readl(STUPID_DEVICE_DMA_KICK_ON_READ);
Also works on x86 ? (It does on power, maybe not on ARM).
Cheers,
Ben.
(*) From a Linux API perspective, all of this is only valid if the
memory was allocated by dma_alloc_coherent(). Anything obtained by
dma_map_something() might have been bounce buffered or might require
extra cache flushes on some architectures, and thus needs
dma_sync_for_{cpu,device} calls.
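A toy model of why the footnote matters, assuming the streaming mapping
ended up bounce buffered. The demo_ names are hypothetical;
demo_sync_for_device() merely stands in for what a call such as
dma_sync_single_for_device() may have to do on such a system. The point
is only that the device reads a different copy, so no amount of
writel() ordering helps until the sync has happened.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_LEN 16

/* Hypothetical model of a streaming mapping that got bounce buffered. */
struct demo_map {
	uint8_t cpu_buf[DEMO_LEN]; /* what the driver writes to */
	uint8_t bounce[DEMO_LEN];  /* what the device actually reads */
};

/* Stand-in for the sync step: copy CPU data into the bounce buffer. */
static void demo_sync_for_device(struct demo_map *m, size_t len)
{
	assert(len <= DEMO_LEN);
	memcpy(m->bounce, m->cpu_buf, len);
}

/* What the (pretend) device would observe at offset off when kicked. */
static uint8_t demo_device_read(const struct demo_map *m, size_t off)
{
	assert(off < DEMO_LEN);
	return m->bounce[off];
}
```

In this model, a store to cpu_buf stays invisible to
demo_device_read() until demo_sync_for_device() has run, whatever
ordering the subsequent kick provides.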