Message-Id: <200908061556.55390.arnd@arndb.de>
Date:	Thu, 6 Aug 2009 15:56:55 +0200
From:	Arnd Bergmann <arnd@...db.de>
To:	Gregory Haskins <ghaskins@...ell.com>
Cc:	linux-kernel@...r.kernel.org,
	alacrityvm-devel@...ts.sourceforge.net, netdev@...r.kernel.org
Subject: Re: [PATCH 1/7] shm-signal: shared-memory signals

On Monday 03 August 2009, Gregory Haskins wrote:
> shm-signal provides a generic shared-memory based bidirectional
> signaling mechanism.  It is used in conjunction with an existing
> signal transport (such as posix-signals, interrupts, pipes, etc) to
> increase the efficiency of the transport since the state information
> is directly accessible to both sides of the link.  The shared-memory
> design provides very cheap access to features such as event-masking
> and spurious delivery mitigation, and is useful for implementing higher
> level shared-memory constructs such as rings.

Looks like a very useful feature in general.

> +struct shm_signal_irq {
> +       __u8                  enabled;
> +       __u8                  pending;
> +       __u8                  dirty;
> +};

Won't this layout cause cache line ping pong? Other schemes I have
seen try to separate the bits so that each cache line is written to
by only one side. This gets much more interesting if the two sides
are on remote ends of an I/O link, e.g. using a nontransparent
PCI bridge, where you only want to send stores over the wire, but
never fetches or even read-modify-write cycles.

Your code is probably optimal if you only communicate between host
and guest code on the same CPU, but not so good if it crosses NUMA
nodes or worse.
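For illustration, one layout that avoids the ping-pong pads each
side's fields out to its own cache line, so each end of the link only
stores into the line it owns. A rough sketch (SHM_CACHELINE and the
field ownership are my assumptions, not taken from your patch):

	#define SHM_CACHELINE 64

	struct shm_signal_irq_split {
		/* line 0: written by the receiving side only */
		__u8 enabled;
		__u8 __pad0[SHM_CACHELINE - 1];
		/* line 1: written by the signaling side only */
		__u8 dirty;
		__u8 pending;
		__u8 __pad1[SHM_CACHELINE - 2];
	};

Whether pending can really live on one side's line depends on who
clears it, so take this only as a sketch of the idea.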

> +struct shm_signal_desc {
> +       __u32                 magic;
> +       __u32                 ver;
> +       struct shm_signal_irq irq[2];
> +};

This data structure has implicit padding of two bytes at the end.
How about adding another '__u16 reserved' to make it explicit?
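I.e. something like:

	struct shm_signal_desc {
		__u32                 magic;
		__u32                 ver;
		struct shm_signal_irq irq[2];
		__u16                 reserved;
	};

which pins the size at 16 bytes explicitly rather than relying on the
compiler to pad.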

> +	/*
> +	 * We always mark the remote side as dirty regardless of whether
> +	 * they need to be notified.
> +	 */
> +	irq->dirty = 1;
> +	wmb();   /* dirty must be visible before we test the pending state */
> +
> +	if (irq->enabled && !irq->pending) {
> +		rmb();
> +
> +		/*
> +		 * If the remote side has enabled notifications, and we do
> +		 * not see a notification pending, we must inject a new one.
> +		 */
> +		irq->pending = 1;
> +		wmb(); /* make it visible before we do the injection */
> +
> +		s->ops->inject(s);
> +	}

Barriers always confuse me, but the rmb() looks slightly wrong. AFAIU
it only orders reads after the barrier against reads before it, but
you don't do any reads after it.

The (irq->enabled && !irq->pending) check could be done before the
irq->dirty = 1 store arrives at the bus, but that does not seem to
hurt; it would at most cause a duplicate ->inject().
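If you did want the check to wait for the dirty store, that is a
store-before-load ordering, which neither wmb() nor rmb() provides;
it takes a full mb(). Untested sketch of what I would expect instead:

	irq->dirty = 1;
	mb();	/* order the dirty store before the enabled/pending loads */

	if (irq->enabled && !irq->pending) {
		irq->pending = 1;
		wmb();	/* pending must be visible before the injection */

		s->ops->inject(s);
	}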

Regarding the scope of the barrier, did you intentionally use the
global versions (rmb()/wmb()) and not the lighter single-system
(smp_rmb()/smp_wmb()) versions? Your version should cope with remote
links over PCI but looks otherwise optimized for local use, as I
wrote above.
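I.e., if this only ever runs between CPUs of a single cache-coherent
system, the lighter forms would be enough (they degrade to compiler
barriers on non-SMP builds):

	irq->dirty = 1;
	smp_mb();
	...
	irq->pending = 1;
	smp_wmb();

while the global mb()/wmb() remain the safe choice once a non-coherent
I/O link such as the PCI case is involved.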

	Arnd <><