Message-ID: <1345650689.2709.32.camel@bwh-desktop.uk.solarflarecom.com>
Date:	Wed, 22 Aug 2012 16:51:29 +0100
From:	Ben Hutchings <bhutchings@...arflare.com>
To:	David Laight <David.Laight@...LAB.COM>
CC:	"H. Peter Anvin" <hpa@...or.com>,
	Benjamin LaHaise <bcrl@...ck.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	David Miller <davem@...emloft.net>, <tglx@...utronix.de>,
	<mingo@...hat.com>, <netdev@...r.kernel.org>,
	<linux-net-drivers@...arflare.com>, <x86@...nel.org>
Subject: RE: [PATCH 2/3] x86_64: Define 128-bit memory-mapped I/O operations

On Wed, 2012-08-22 at 16:27 +0100, David Laight wrote:
> > Your architecture sounds similar to one I once worked on (Orion
> > Microsystems CNIC/OPA-2).  That architecture had a descriptor ring in
> > device memory, and a single trigger bit would move the head pointer.
> > 
> > We used write combining to write out a set of descriptors, and then used
> > a non-write-combining write to do the final write which bumps the head
> > pointer.  The UC write flushes the write combiners ahead of it, so it
> > ends up with two transactions (one for the WC data and one for the UC
> > trigger) but it could frequently push quite a few descriptors in that
> > operation.
> 
> The code actually looks more like a normal ethernet ring interface
> with an 'owner' bit in each entry.
> So it is important to write the owner bit last.

You're confused.  The 'owner' field in the descriptor pointer is part of
the memory protection mechanism for user-level networking.  And we don't
have up to 1024 TX descriptors in a single ring, we have up to 1024
separate rings - in host memory, of course.  Which is why we have the
'TX push' feature to reduce latency for a currently empty TX queue.
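
As a rough illustration of the idea only, here is a user-space sketch of
a 'push when the queue is empty, otherwise ring the doorbell' decision.
Everything in it is an assumption made for the example: struct tx_ring,
struct fake_nic, nic_push(), nic_doorbell(), tx_post() and RING_SIZE are
hypothetical names, the NIC is a plain struct rather than real MMIO, and
it does not reflect the actual driver or register layout.

```c
#include <stdint.h>

#define RING_SIZE 8u  /* hypothetical size; must be a power of two */

/* Hypothetical TX descriptor ring living in host memory. */
struct tx_ring {
    uint64_t desc[RING_SIZE];
    unsigned write_count;   /* descriptors queued by the driver */
    unsigned read_count;    /* descriptors completed by the NIC */
};

/* Stand-in for the NIC's registers; real hardware would be MMIO. */
struct fake_nic {
    uint64_t pushed_desc;   /* descriptor delivered by the last push */
    unsigned write_ptr;     /* last write pointer the NIC was told about */
    unsigned pushes;
    unsigned doorbells;
};

/* Models one wide write carrying the descriptor and write pointer together. */
static void nic_push(struct fake_nic *nic, uint64_t desc, unsigned wptr)
{
    nic->pushed_desc = desc;
    nic->write_ptr = wptr;
    nic->pushes++;
}

/* Models a plain doorbell: the NIC must fetch the descriptor from host memory. */
static void nic_doorbell(struct fake_nic *nic, unsigned wptr)
{
    nic->write_ptr = wptr;
    nic->doorbells++;
}

/*
 * Queue one descriptor. Returns 1 if it was pushed to the NIC directly,
 * 0 if only the doorbell was rung. No ring-full check, for brevity.
 */
static int tx_post(struct tx_ring *ring, struct fake_nic *nic, uint64_t desc)
{
    int was_empty = (ring->write_count == ring->read_count);
    unsigned wptr;

    /* The descriptor always goes into the host-memory ring. */
    ring->desc[ring->write_count & (RING_SIZE - 1)] = desc;
    ring->write_count++;
    wptr = ring->write_count & (RING_SIZE - 1);

    if (was_empty) {
        /* Empty queue: hand the descriptor over with the pointer update. */
        nic_push(nic, desc, wptr);
        return 1;
    }
    nic_doorbell(nic, wptr);
    return 0;
}
```

In this model the first tx_post() on an idle ring takes the push path,
and later posts fall back to the doorbell until read_count catches up.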

> It might be possible to set multiple ring entries in two TLPs
> by first writing all of them (maybe with write combining)
> but without changing the ownership of the first entry.
> Then doing a second transfer to update the owner bit in
> the first entry.
> The order of the writes in the first transfer would then not
> matter.
> 
> FWIW can you even guarantee to do an atomic 64-bit PCIe transfer
> on many systems (without resorting to a DMA unit)?

On any architecture that implements readq and writeq these had better be
atomic.
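
To make concrete why that matters, here is a small user-space model (not
kernel code) contrasting a single 64-bit register write with a fallback
that splits it into two 32-bit writes. The mock register and every name
in it (struct mock_reg, mock_write64(), mock_write32(),
write64_split_lo_hi()) are assumptions for illustration; it only logs
what a device would see after each bus transaction.

```c
#include <stdint.h>

#define MAX_LOG 4u

/* Hypothetical register model: records its contents after every write. */
struct mock_reg {
    uint64_t value;           /* current register contents */
    uint64_t seen[MAX_LOG];   /* contents observed after each transaction */
    unsigned n_writes;        /* number of bus transactions so far */
};

static void log_state(struct mock_reg *r)
{
    if (r->n_writes < MAX_LOG)
        r->seen[r->n_writes] = r->value;
    r->n_writes++;
}

/* One transaction: the device only ever sees the old or the new value. */
static void mock_write64(struct mock_reg *r, uint64_t v)
{
    r->value = v;
    log_state(r);
}

/* One 32-bit transaction updating the low (half 0) or high (half 1) word. */
static void mock_write32(struct mock_reg *r, unsigned half, uint32_t v)
{
    if (half == 0)
        r->value = (r->value & 0xffffffff00000000ULL) | v;
    else
        r->value = (r->value & 0x00000000ffffffffULL) | ((uint64_t)v << 32);
    log_state(r);
}

/* Non-atomic fallback: low word first, then high word. */
static void write64_split_lo_hi(struct mock_reg *r, uint64_t v)
{
    mock_write32(r, 0, (uint32_t)v);
    mock_write32(r, 1, (uint32_t)(v >> 32));
}
```

With the split version the model shows an intermediate state that mixes
the old high word with the new low word, which is exactly what a device
must never observe if it acts on the register as a single 64-bit value.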

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
