Message-Id: <20080708180500.e8a61231.billfink@mindspring.com>
Date: Tue, 8 Jul 2008 18:05:00 -0400
From: Bill Fink <billfink@...dspring.com>
To: Stephen Hemminger <stephen.hemminger@...tta.com>
Cc: Roland Dreier <rdreier@...co.com>,
Evgeniy Polyakov <johnpol@....mipt.ru>,
David Miller <davem@...emloft.net>, aglo@...i.umich.edu,
shemminger@...tta.com, netdev@...r.kernel.org, rees@...ch.edu,
bfields@...ldses.org
Subject: Re: setsockopt()
On Tue, 8 Jul 2008, Stephen Hemminger wrote:
> On Mon, 07 Jul 2008 23:29:31 -0700
> Roland Dreier <rdreier@...co.com> wrote:
>
> > Interesting... I'd not tried nuttcp before, and on my testbed, which is
> > a very high-bandwidth, low-RTT network (IP-over-InfiniBand with DDR IB,
> > so the network is capable of 16 Gbps, and the RTT is ~25 microseconds),
> > the difference between autotuning and not for nuttcp is huge (testing
> > with 2.6.26-rc8 plus some pending 2.6.27 patches that add checksum
> > offload, LSO and LRO to the IP-over-IB driver):
> >
> > nuttcp -T30 -i1 ends up with:
> >
> > 14465.0625 MB / 30.01 sec = 4043.6073 Mbps 82 %TX 2 %RX
> >
> > while setting the window even to 128 KB with
> > nuttcp -w128k -T30 -i1 ends up with:
> >
> > 36416.8125 MB / 30.00 sec = 10182.8137 Mbps 90 %TX 96 %RX
> >
> > so it's a factor of 2.5 with nuttcp. I've never seen other apps behave
> > like that -- for example NPtcp (netpipe) only gets slower when
> > explicitly setting the window size.
> >
> > Strange...
>
> I suspect that the link is so fast that the window growth isn't happening
> fast enough. With only a 30 second test, you probably barely made it
> out of TCP slow start.
Nah. 30 seconds is plenty of time. I got up to nearly 8 Gbps
in 4 seconds (see my test report in earlier message in this thread),
and that was on an ~72 ms RTT network path. Roland's IB network
only has a ~25 usec RTT.
BTW I believe there is one other important difference between the way
the tcp_rmem/tcp_wmem autotuning parameters are handled versus the way
the rmem_max/wmem_max parameters are used when explicitly setting the
socket buffer sizes. I believe the tcp_rmem/tcp_wmem autotuning maximum
parameters are hard limits, with the default maximum tcp_rmem setting
being ~170 KB and the default maximum tcp_wmem setting being 128 KB.
On the other hand, I believe rmem_max/wmem_max determine the maximum
value that can be requested via the SO_RCVBUF/SO_SNDBUF setsockopt()
calls. But Linux then doubles the requested value, so when Roland
specified a "-w128" nuttcp parameter, he actually got a socket buffer
size of 256 KB, which would be double what is available in the
autotuning case, assuming the tcp_rmem/tcp_wmem settings are at their
default values. That alone could account for a factor of 2 between the
two test cases. The "-v" verbose option to nuttcp might shed some
light on this hypothesis.
-Bill
--