[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1345650689.2709.32.camel@bwh-desktop.uk.solarflarecom.com>
Date: Wed, 22 Aug 2012 16:51:29 +0100
From: Ben Hutchings <bhutchings@...arflare.com>
To: David Laight <David.Laight@...LAB.COM>
CC: "H. Peter Anvin" <hpa@...or.com>,
Benjamin LaHaise <bcrl@...ck.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
David Miller <davem@...emloft.net>, <tglx@...utronix.de>,
<mingo@...hat.com>, <netdev@...r.kernel.org>,
<linux-net-drivers@...arflare.com>, <x86@...nel.org>
Subject: RE: [PATCH 2/3] x86_64: Define 128-bit memory-mapped I/O operations
On Wed, 2012-08-22 at 16:27 +0100, David Laight wrote:
> > Your architecture sounds similar to one I once worked on (Orion
> > Microsystems CNIC/OPA-2). That architecture had a descriptor ring in
> > device memory, and a single trigger bit would move the head pointer.
> >
> > We used write combining to write out a set of descriptors, and then
> > used
> > a non-write-combining write to do the final write which bumps the head
> > pointer. The UC write flushes the write combiners ahead of it, so it
> > ends up with two transactions (one for the WC data and one for the UC
> > trigger) but it could frequently push quite a few descriptors in that
> > operation.
>
> The code actually looks more like a normal ethernet ring interface
> with an 'owner' bit in each entry.
> So it is important to write the owner bit last.
You're confused. The 'owner' field in the descriptor pointer is part of
the memory protection mechanism for user-level networking. And we don't
have up to 1024 TX descriptors in a single ring, we have up to 1024
separate rings - in host memory, of course. Which is why we have the
'TX push' feature to reduce latency for a currently empty TX queue.
> It might be possibly to set multiple ring entries in two TLPs
> by first writing all of them (maybe with write combining)
> but without changing the ownership of the first entry.
> Then doing a second transfer to update the owner bit it
> the first entry.
> The order of the writes in the first transfer would then not
> matter.
>
> FWIW can you even guarantee to do an atomic 64bit PCIe transfer
> on many systems (without resorting to a dma unit).
On any architecture that implements readq and writeq these had better be
atomic.
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists