Message-ID: <20090806205109.GA1330@ovro.caltech.edu>
Date: Thu, 6 Aug 2009 13:51:09 -0700
From: "Ira W. Snyder" <iws@...o.caltech.edu>
To: Gregory Haskins <ghaskins@...ell.com>
Cc: Arnd Bergmann <arnd@...db.de>, paulmck@...ux.vnet.ibm.com,
alacrityvm-devel@...ts.sourceforge.net,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH 1/7] shm-signal: shared-memory signals
On Thu, Aug 06, 2009 at 09:11:15AM -0600, Gregory Haskins wrote:
> Hi Arnd,
>
> >>> On 8/6/2009 at 9:56 AM, in message <200908061556.55390.arnd@...db.de>, Arnd
> Bergmann <arnd@...db.de> wrote:
> > On Monday 03 August 2009, Gregory Haskins wrote:
> >> shm-signal provides a generic shared-memory based bidirectional
> >> signaling mechanism. It is used in conjunction with an existing
> >> signal transport (such as posix-signals, interrupts, pipes, etc) to
> >> increase the efficiency of the transport since the state information
> >> is directly accessible to both sides of the link. The shared-memory
> >> design provides very cheap access to features such as event-masking
> >> and spurious delivery mitigation, and is useful for implementing higher
> >> level shared-memory constructs such as rings.
> >
> > Looks like a very useful feature in general.
>
> Thanks, I was hoping that would be the case.
>
> >
> >> +struct shm_signal_irq {
> >> + __u8 enabled;
> >> + __u8 pending;
> >> + __u8 dirty;
> >> +};
> >
> > Won't this layout cause cache line ping pong? Other schemes I have
> > seen try to separate the bits so that each cache line is written to
> > by only one side.
>
> It could possibly use some optimization in that regard. I generally consider myself an expert at concurrent programming, but this lockless stuff is, um, hard ;) I was going for correctness first.
>
> Long story short, any suggestions on ways to split this up are welcome (particularly now, before the ABI is sealed ;)
>
> > This gets much more interesting if the two sides
> > are on remote ends of an I/O link, e.g. using a nontransparent
> > PCI bridge, where you only want to send stores over the wire, but
> > never fetches or even read-modify-write cycles.
>
> /me head explodes ;)
>
I've actually implemented this idea for virtio. Read the virtio-over-PCI
patches I posted, and you'll see that the entire virtqueue
implementation NEVER uses reads across the PCI bus, only writes. The
slowpath configuration space uses reads, but the virtqueues themselves
are write-only.

Some trivial benchmarking against an earlier driver that did
writes+reads across the PCI bus showed that the write-only driver was
about 2x as fast. (Throughput increased from ~30MB/sec to ~65MB/sec).

I'm sure the write-only design was not the only change responsible for
the speedup, but it was definitely a contributing factor.
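
To make the write-only idea concrete, here is a minimal sketch of a
one-directional ring in which each side only stores into the other
side's memory and only loads from its own. This is NOT the code from
the virtio-over-PCI patches; every name in it (rx_window, tx_window,
ring_send, ring_recv, RING_SIZE) is hypothetical, and plain pointer
stores stand in for whatever MMIO accessors and barriers a real driver
would need.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 8u	/* hypothetical; must be a power of two */

/* Lives in the receiver's memory. The sender only ever stores here. */
struct rx_window {
	uint32_t slots[RING_SIZE];
	volatile uint32_t head;	/* next slot the sender will fill */
};

/* Lives in the sender's memory. The receiver only ever stores here. */
struct tx_window {
	volatile uint32_t tail;	/* next slot the receiver will consume */
};

struct sender {
	struct rx_window *remote;	/* across the link: stores only */
	struct tx_window *local;	/* local memory: loads only */
	uint32_t head;			/* private shadow of remote->head */
};

struct receiver {
	struct rx_window *local;	/* local memory: loads only */
	struct tx_window *remote;	/* across the link: stores only */
	uint32_t tail;			/* private shadow of remote->tail */
};

static void ring_init(struct sender *s, struct receiver *r,
		      struct rx_window *rxw, struct tx_window *txw)
{
	assert((RING_SIZE & (RING_SIZE - 1)) == 0);
	memset(rxw, 0, sizeof(*rxw));
	memset(txw, 0, sizeof(*txw));
	s->remote = rxw; s->local = txw; s->head = 0;
	r->local = rxw; r->remote = txw; r->tail = 0;
}

/* Returns 0 on success, -1 if the ring is full. */
static int ring_send(struct sender *s, uint32_t value)
{
	/* Fullness check uses the private head and the LOCAL tail copy. */
	if (s->head - s->local->tail == RING_SIZE)
		return -1;

	s->remote->slots[s->head % RING_SIZE] = value;	/* store over link */
	/* A real driver needs a write barrier here so the payload is
	 * visible before the index update. */
	s->head++;
	s->remote->head = s->head;			/* store over link */
	return 0;
}

/* Returns 0 on success, -1 if the ring is empty. */
static int ring_recv(struct receiver *r, uint32_t *value)
{
	if (r->tail == r->local->head)			/* local load */
		return -1;

	*value = r->local->slots[r->tail % RING_SIZE];	/* local load */
	r->tail++;
	r->remote->tail = r->tail;			/* store over link */
	return 0;
}
```

Each index has exactly one writer, and the side that needs to read an
index finds it in its own memory, so nothing in the fast path fetches
across the link. The same split also keeps each shared location
written by only one side, which is the property Arnd was asking about
for the cache line case.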
Ira