Message-ID: <alpine.DEB.1.10.0906050959380.23895@gentwo.org>
Date:	Fri, 5 Jun 2009 10:18:15 -0400 (EDT)
From:	Christoph Lameter <cl@...ux-foundation.org>
To:	David Miller <davem@...emloft.net>
cc:	rdreier@...co.com, netdev@...r.kernel.org, yosefe@...taire.COM
Subject: Re: IPoIB: Fix multicast packet drops before join is complete

On Thu, 4 Jun 2009, David Miller wrote:
> From: Christoph Lameter <cl@...ux-foundation.org>
> Date: Thu, 4 Jun 2009 11:52:48 -0400 (EDT)
>
> > On Wed, 3 Jun 2009, David Miller wrote:
> >
> >> We don't do this for ARP, for example.  We have a 3 packet limit just
> >> like IPoIB implements for multicast here.
> >
> > ARP is a management protocol with specific semantics. Socket protocols are
> > dealing with streams of datagrams.
>
> Go look at what the ARP backlog queue actually does, then come back to
> this conversation (hint: it's not a backlog for ARP packets, it's
> a backlog for packets waiting for ARP to resolve).

ARP is tied to managing small chunks of information about the network
infrastructure. Buffering the first few and throwing the rest away is
appropriate there for what the ARP protocol intends to do.

UDP multicasting can be used for streaming information. And right now the
IPoIB layer is dropping thousands of packets whenever there has been a
pause of a few minutes or when a new multicast group is used and there is
some delay while the network reestablishes the multicast route.

On IPoIB the app can send without being throttled to the speed supported
by the hardware in these cases. The faster CPUs get, the more packets
will be dropped in these bursts. The socket layer lets a sender opt out
of waiting via MSG_DONTWAIT, but in these cases we do not honor the fact
that the flag is not set: instead of blocking the sender, we simply drop
the packets.

If UDP multicasting is used for a purpose like ARP then this is
appropriate but UDP multicasting has a variety of uses. If you want ARP
style semantics then the buffer size can be limited in such a way that
only 3 packets are buffered by setting SO_SNDBUF and using MSG_DONTWAIT.
But without the patch this method is forced upon all uses of UDP
multicast over the IPoIB layer.

And strangely if you use a 1G NIC everything is fine and the socket layer
properly throttles for packet bursts.
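
To make the SO_SNDBUF plus MSG_DONTWAIT option mentioned above concrete,
here is a minimal userspace sketch in Python. It is only an illustration
of how an application could opt in to "buffer a little, drop the rest"
behaviour, not what the patch does. The helper names make_bounded_sender
and send_burst are hypothetical. SO_SNDBUF is a byte limit that the kernel
may round or adjust, so a "3 packet" budget is approximate at best, and
MSG_DONTWAIT is assumed to be available (it is on Linux).

```python
import socket


def make_bounded_sender(sndbuf_bytes):
    """Create a UDP socket with a small send buffer.

    Assumption: the kernel treats sndbuf_bytes as a hint and may adjust
    the effective size, so this bounds buffering only approximately.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf_bytes)
    return sock


def send_burst(sock, dest, payload, count):
    """Send `count` datagrams without blocking; return (sent, dropped).

    With MSG_DONTWAIT a full send buffer raises BlockingIOError
    (EAGAIN/EWOULDBLOCK) instead of putting the caller to sleep, and
    the application chooses to count that datagram as dropped.
    """
    sent = 0
    dropped = 0
    for _ in range(count):
        try:
            sock.sendto(payload, socket.MSG_DONTWAIT, dest)
            sent += 1
        except BlockingIOError:
            dropped += 1
    return sent, dropped
```

An application that wants the opposite behaviour simply leaves out
MSG_DONTWAIT, in which case a blocking socket is expected to sleep in
sendto() until buffer space is available rather than lose data.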