Date:	Thu, 06 Aug 2009 09:11:15 -0600
From:	"Gregory Haskins" <ghaskins@...ell.com>
To:	"Arnd Bergmann" <arnd@...db.de>
Cc:	<paulmck@...ux.vnet.ibm.com>,
	<alacrityvm-devel@...ts.sourceforge.net>,
	<linux-kernel@...r.kernel.org>, <netdev@...r.kernel.org>
Subject: Re: [PATCH 1/7] shm-signal: shared-memory signals

Hi Arnd,

>>> On 8/6/2009 at 9:56 AM, in message <200908061556.55390.arnd@...db.de>, Arnd Bergmann <arnd@...db.de> wrote:
> On Monday 03 August 2009, Gregory Haskins wrote:
>> shm-signal provides a generic shared-memory based bidirectional
>> signaling mechanism.  It is used in conjunction with an existing
>> signal transport (such as posix-signals, interrupts, pipes, etc) to
>> increase the efficiency of the transport since the state information
>> is directly accessible to both sides of the link.  The shared-memory
>> design provides very cheap access to features such as event-masking
>> and spurious delivery mitigation, and is useful for implementing higher
>> level shared-memory constructs such as rings.
> 
> Looks like a very useful feature in general.

Thanks, I was hoping that would be the case.

> 
>> +struct shm_signal_irq {
>> +       __u8                  enabled;
>> +       __u8                  pending;
>> +       __u8                  dirty;
>> +};
> 
> Won't this layout cause cache line ping pong? Other schemes I have
> seen try to separate the bits so that each cache line is written to
> by only one side.

It could possibly use some optimization in that regard.  I generally consider myself an expert at concurrent programming, but this lockless stuff is, um, hard ;)  I was going for correctness first.

Long story short, any suggestions on ways to split this up are welcome (particularly now, before the ABI is sealed ;)

> This gets much more interesting if the two sides
> are on remote ends of an I/O link, e.g. using a nontransparent
> PCI bridge, where you only want to send stores over the wire, but
> never fetches or even read-modify-write cycles.

/me head explodes ;)

> 
> Your code is probably optimal if you only communicate between host
> and guest code on the same CPU, but not so good if it crosses NUMA
> nodes or worse.

Yeah, I won't lie and say it wasn't designed primarily with the former case in mind (since it was my particular itch).  I would certainly appreciate any insight on ways to make it more generally applicable for things like the nontransparent bridge model, and/or NUMA, though.

> 
>> +struct shm_signal_desc {
>> +       __u32                 magic;
>> +       __u32                 ver;
>> +       struct shm_signal_irq irq[2];
>> +};
> 
> This data structure has implicit padding of two bytes at the end.
> How about adding another '__u16 reserved' to make it explicit?

Good idea.  Will fix.

> 
>> +	/*
>> +	 * We always mark the remote side as dirty regardless of whether
>> +	 * they need to be notified.
>> +	 */
>> +	irq->dirty = 1;
>> +	wmb();   /* dirty must be visible before we test the pending state */
>> +
>> +	if (irq->enabled && !irq->pending) {
>> +		rmb();
>> +
>> +		/*
>> +		 * If the remote side has enabled notifications, and we do
>> +		 * not see a notification pending, we must inject a new one.
>> +		 */
>> +		irq->pending = 1;
>> +		wmb(); /* make it visible before we do the injection */
>> +
>> +		s->ops->inject(s);
>> +	}
> 
> Barriers always confuse me, but the rmb() looks slightly wrong. AFAIU
> it only prevents reads after the barrier from being done before the
> barrier, but you don't do any reads after it.

It's probably overzealous barriering on my part.  I had a conversation with Paul McKenney (CC'd) where I was wondering whether a conditional acts as an implicit barrier.  His response was something to the effect of "on most arches, yes, but not all".  And we concluded that, to be conservative, there should be an rmb() after the if().

That said, tbh I am not sure if it's actually needed.  Paul?

> 
> The (irq->enabled && !irq->pending) check could be done before the
> irq->dirty = 1 arrives at the bus, but that does not seem to hurt, it
> would at most cause a duplicate ->inject().
> 
> Regarding the scope of the barrier, did you intentionally use the
> global versions (rmb()/wmb()) and not the lighter single-system
> (smp_rmb()/smp_wmb()) versions? Your version should cope with remote
> links over PCI but looks otherwise optimized for local use, as I
> wrote above.

Yes, it was intentional: both for the remote case, as you point out, and for the case where the local sides might be mismatched (for instance, a guest compiled as UP talking to an SMP host).

Thanks Arnd,
-Greg

