netdev - Re: [PATCH 1/7] shm-signal: shared-memory signals

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20090806205109.GA1330@ovro.caltech.edu>
Date:	Thu, 6 Aug 2009 13:51:09 -0700
From:	"Ira W. Snyder" <iws@...o.caltech.edu>
To:	Gregory Haskins <ghaskins@...ell.com>
Cc:	Arnd Bergmann <arnd@...db.de>, paulmck@...ux.vnet.ibm.com,
	alacrityvm-devel@...ts.sourceforge.net,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH 1/7] shm-signal: shared-memory signals

On Thu, Aug 06, 2009 at 09:11:15AM -0600, Gregory Haskins wrote:
> Hi Arnd,
> 
> >>> On 8/6/2009 at  9:56 AM, in message <200908061556.55390.arnd@...db.de>, Arnd
> Bergmann <arnd@...db.de> wrote: 
> > On Monday 03 August 2009, Gregory Haskins wrote:
> >> shm-signal provides a generic shared-memory based bidirectional
> >> signaling mechanism.  It is used in conjunction with an existing
> >> signal transport (such as posix-signals, interrupts, pipes, etc) to
> >> increase the efficiency of the transport since the state information
> >> is directly accessible to both sides of the link.  The shared-memory
> >> design provides very cheap access to features such as event-masking
> >> and spurious delivery mititgation, and is useful implementing higher
> >> level shared-memory constructs such as rings.
> > 
> > Looks like a very useful feature in general.
> 
> Thanks, I was hoping that would be the case.
> 
> > 
> >> +struct shm_signal_irq {
> >> +       __u8                  enabled;
> >> +       __u8                  pending;
> >> +       __u8                  dirty;
> >> +};
> > 
> > Won't this layout cause cache line ping pong? Other schemes I have
> > seen try to separate the bits so that each cache line is written to
> > by only one side.
> 
> It could possibly use some optimization in that regard.  I generally consider myself an expert at concurrent programming, but this lockless stuff is, um, hard ;)  I was going for correctness first.
> 
> Long story short, any suggestions on ways to split this up are welcome (particularly now, before the ABI is sealed ;)
> 
> > This gets much more interesting if the two sides
> > are on remote ends of an I/O link, e.g. using a nontransparent
> > PCI bridge, where you only want to send stores over the wire, but
> > never fetches or even read-modify-write cycles.
> 
> /me head explodes ;)
> 

I've actually implemented this idea for virtio. Read the virtio-over-PCI
patches I posted, and you'll see that the entire virtqueue
implementation NEVER uses reads across the PCI bus, only writes. The
slowpath configuration space uses reads, but the virtqueues themselves
are write-only.

Some trivial benchmarking against an earlier driver that did
writes+reads across the PCI bus showed that the write-only driver was
about 2x as fast. (Throughput increased from ~30MB/sec to ~65MB/sec).

I'm sure the write-only design was not the only change responsible for
the speedup, but it was definitely a contributing factor.

Ira
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html