Message-Id: <20110310002458.5a94f563.billfink@mindspring.com>
Date:	Thu, 10 Mar 2011 00:24:58 -0500
From:	Bill Fink <billfink@...dspring.com>
To:	Lucas Nussbaum <lucas.nussbaum@...ia.fr>
Cc:	Injong Rhee <rhee@...u.edu>,
	Stephen Hemminger <shemminger@...tta.com>,
	David Miller <davem@...emloft.net>, xiyou.wangcong@...il.com,
	netdev@...r.kernel.org, sangtae.ha@...il.com
Subject: Re: [PATCH] Make CUBIC Hystart more robust to RTT variations

On Wed, 9 Mar 2011, Lucas Nussbaum wrote:

> On 08/03/11 at 20:30 -0500, Injong Rhee wrote:
> > Now, both tools can be wrong. But that is not catastrophic since
> > congestion avoidance can kick in to save the day. In a pipe where no
> > other flows are competing, then exiting slow start too early can
> > slow things down as the window can be still too small. But that is
> > in fact when delays are most reliable. So those tests that say bad
> > performance with hystart are in fact, where hystart is supposed to
> > perform well.
> 
> Hi,
> 
> In my setup, there is no congestion at all (except the buffer bloat).
> Without Hystart, transferring 8 Gb of data takes 9s, with CUBIC exiting
> slow start at ~2000 packets.
> With Hystart, transferring 8 Gb of data takes 19s, with CUBIC exiting
> slow start at ~20 packets.
> I don't think that this is "hystart performing well". We could just as
> well remove slow start completely, and only do congestion avoidance,
> then.
> 
> While I see the value in Hystart, it's clear that there are some flaws
> in the current implementation. It probably makes sense to disable
> hystart by default until those problems are fixed.

Here are some tests I performed across real networks, where
congestion is generally not an issue, with a 2.6.35 kernel on
the transmit side.

8 GB transfer across an 18 ms RTT path with autotuning and hystart:

i7test7% nuttcp -n8g -i1 192.168.1.23
  517.9375 MB /   1.00 sec = 4344.6096 Mbps     0 retrans
  688.4375 MB /   1.00 sec = 5775.1998 Mbps     0 retrans
  692.9375 MB /   1.00 sec = 5812.7462 Mbps     0 retrans
  698.0625 MB /   1.00 sec = 5855.8078 Mbps     0 retrans
  699.8750 MB /   1.00 sec = 5871.0123 Mbps     0 retrans
  710.5625 MB /   1.00 sec = 5960.5707 Mbps     0 retrans
  728.8125 MB /   1.00 sec = 6113.7652 Mbps     0 retrans
  751.3750 MB /   1.00 sec = 6302.9210 Mbps     0 retrans
  783.8750 MB /   1.00 sec = 6575.6201 Mbps     0 retrans
  825.1875 MB /   1.00 sec = 6921.8145 Mbps     0 retrans
  875.4375 MB /   1.00 sec = 7343.9811 Mbps     0 retrans

 8192.0000 MB /  11.26 sec = 6102.4718 Mbps 11 %TX 28 %RX 0 retrans 18.92 msRTT

Ramps up quickly to a little under 6 Gbps, then increases more
slowly to 7+ Gbps, with no TCP retransmissions.
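
For anyone post-processing output like the above, here is a minimal Python
sketch that parses the per-interval lines and cross-checks the reported rate.
It assumes nuttcp's "MB" means 2^20 bytes and "Mbps" means 10^6 bits/s, which
is consistent with the figures shown but is not stated in this message. The
names parse_intervals and mb_to_mbps are illustrative only.

```python
import re

# Matches nuttcp interval lines such as:
#   517.9375 MB /   1.00 sec = 4344.6096 Mbps     0 retrans
# The final summary line (with %TX/%RX and msRTT) deliberately does not match.
LINE_RE = re.compile(
    r"^\s*(?P<mb>\d+\.\d+) MB /\s+(?P<sec>\d+\.\d+) sec = "
    r"\s*(?P<mbps>\d+\.\d+) Mbps\s+(?P<retrans>\d+) retrans\s*$"
)


def parse_intervals(text):
    """Return a list of (mb, sec, mbps, retrans) tuples, one per interval line."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            rows.append((float(m["mb"]), float(m["sec"]),
                         float(m["mbps"]), int(m["retrans"])))
    return rows


def mb_to_mbps(mb, sec):
    """Convert MB (assumed 2**20 bytes) over sec seconds to Mbps (10**6 bits/s)."""
    return mb * 2**20 * 8 / sec / 1e6


if __name__ == "__main__":
    sample = (
        "  517.9375 MB /   1.00 sec = 4344.6096 Mbps     0 retrans\n"
        "  688.4375 MB /   1.00 sec = 5775.1998 Mbps     0 retrans\n"
    )
    for mb, sec, mbps, retrans in parse_intervals(sample):
        print(f"reported {mbps:.1f} Mbps, recomputed {mb_to_mbps(mb, sec):.1f} Mbps, "
              f"{retrans} retrans")
```

The recomputed values differ slightly from the reported ones because the
printed interval length is rounded to 1.00 sec.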

8 GB transfer across an 18 ms RTT path with 40 MB socket buffer and hystart:

i7test7% nuttcp -n8g -w40m -i1 192.168.1.23
  970.0625 MB /   1.00 sec = 8136.8475 Mbps     0 retrans
 1181.1875 MB /   1.00 sec = 9909.0045 Mbps     0 retrans
 1181.2500 MB /   1.00 sec = 9908.6369 Mbps     0 retrans
 1181.3125 MB /   1.00 sec = 9909.8747 Mbps     0 retrans
 1181.2500 MB /   1.00 sec = 9909.0531 Mbps     0 retrans
 1181.2500 MB /   1.00 sec = 9908.8153 Mbps     0 retrans
 1181.2500 MB /   1.00 sec = 9909.0729 Mbps     0 retrans

 8192.0000 MB /   7.13 sec = 9633.5814 Mbps 17 %TX 42 %RX 0 retrans 18.91 msRTT

Quickly ramps up to full 10-GigE line rate, with no TCP retrans.

8 GB transfer across an 18 ms RTT path with autotuning and no hystart:

i7test7% nuttcp -n8g -i1 192.168.1.23
  845.4375 MB /   1.00 sec = 7091.5828 Mbps     0 retrans
 1181.3125 MB /   1.00 sec = 9910.0134 Mbps     0 retrans
 1181.0625 MB /   1.00 sec = 9907.1830 Mbps     0 retrans
 1181.4375 MB /   1.00 sec = 9910.8936 Mbps     0 retrans
 1181.1875 MB /   1.00 sec = 9908.1721 Mbps     0 retrans
 1181.3125 MB /   1.00 sec = 9909.5774 Mbps     0 retrans
 1181.1875 MB /   1.00 sec = 9908.6874 Mbps     0 retrans

 8192.0000 MB /   7.25 sec = 9484.4524 Mbps 18 %TX 41 %RX 0 retrans 18.92 msRTT

Also quickly ramps up to full 10-GigE line rate, with no TCP retrans.

8 GB transfer across an 18 ms RTT path with 40 MB socket buffer and no hystart:

i7test7% nuttcp -n8g -w40m -i1 192.168.1.23
  969.8750 MB /   1.00 sec = 8135.6571 Mbps     0 retrans
 1181.3125 MB /   1.00 sec = 9909.3990 Mbps     0 retrans
 1181.2500 MB /   1.00 sec = 9908.9342 Mbps     0 retrans
 1181.2500 MB /   1.00 sec = 9909.4098 Mbps     0 retrans
 1181.2500 MB /   1.00 sec = 9908.8252 Mbps     0 retrans
 1181.2500 MB /   1.00 sec = 9909.0630 Mbps     0 retrans
 1181.2500 MB /   1.00 sec = 9909.3504 Mbps     0 retrans

 8192.0000 MB /   7.15 sec = 9611.8053 Mbps 18 %TX 42 %RX 0 retrans 18.95 msRTT

Basically the same as the case with 40 MB socket buffer and hystart enabled.

Now trying the same type of tests across an 80 ms RTT path.

8 GB transfer across an 80 ms RTT path with autotuning and hystart:

i7test7% nuttcp -n8g -i1 192.168.1.18
   11.3125 MB /   1.00 sec =   94.8954 Mbps     0 retrans
  441.5625 MB /   1.00 sec = 3704.1021 Mbps     0 retrans
  687.3750 MB /   1.00 sec = 5765.8657 Mbps     0 retrans
  715.5625 MB /   1.00 sec = 6002.6273 Mbps     0 retrans
  709.9375 MB /   1.00 sec = 5955.5958 Mbps     0 retrans
  691.3125 MB /   1.00 sec = 5799.0626 Mbps     0 retrans
  718.6250 MB /   1.00 sec = 6028.3538 Mbps     0 retrans
  718.0000 MB /   1.00 sec = 6023.0205 Mbps     0 retrans
  704.0000 MB /   1.00 sec = 5905.5387 Mbps     0 retrans
  733.3125 MB /   1.00 sec = 6151.4096 Mbps     0 retrans
  738.8750 MB /   1.00 sec = 6198.2381 Mbps     0 retrans
  731.8750 MB /   1.00 sec = 6139.3695 Mbps     0 retrans

 8192.0000 MB /  12.85 sec = 5348.9677 Mbps 10 %TX 23 %RX 0 retrans 80.81 msRTT

Similar to the 18 ms RTT path, but achieving somewhat lower
performance levels, presumably due to the larger RTT.  Ramps
up fairly quickly to a little under 6 Gbps, then increases
more slowly to 6+ Gbps, with no TCP retransmissions.

8 GB transfer across an 80 ms RTT path with 100 MB socket buffer and hystart:

i7test7% nuttcp -n8g -w100m -i1 192.168.1.18
  103.9375 MB /   1.00 sec =  871.8378 Mbps     0 retrans
 1086.5625 MB /   1.00 sec = 9114.6102 Mbps     0 retrans
 1106.6875 MB /   1.00 sec = 9283.5583 Mbps     0 retrans
 1109.3125 MB /   1.00 sec = 9305.5226 Mbps     0 retrans
 1111.1875 MB /   1.00 sec = 9321.9596 Mbps     0 retrans
 1112.8125 MB /   1.00 sec = 9334.8452 Mbps     0 retrans
 1113.6875 MB /   1.00 sec = 9341.6620 Mbps     0 retrans
 1120.2500 MB /   1.00 sec = 9398.0054 Mbps     0 retrans

 8192.0000 MB /   8.37 sec = 8207.2049 Mbps 16 %TX 38 %RX 0 retrans 80.81 msRTT

Quickly ramps up to 9+ Gbps and then slowly increases further,
with no TCP retrans.
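
One way to put the manually set socket buffer sizes (40 MB and 100 MB) in
context is to compare them with the bandwidth-delay product of each path. The
sketch below assumes a 10 Gbps line rate (10-GigE) and uses the RTTs reported
by nuttcp. It also assumes the "m" suffix of -w means 2^20 bytes. The message
does not say how the buffer sizes were chosen, so this is only a plausible
reading, and bdp_bytes is an illustrative name.

```python
MIB = 2**20  # assumed meaning of nuttcp's "m" suffix


def bdp_bytes(rate_bps, rtt_ms):
    """Bandwidth-delay product in bytes for a given rate (bits/s) and RTT (ms)."""
    return rate_bps * (rtt_ms / 1000.0) / 8


if __name__ == "__main__":
    line_rate_bps = 10e9  # assumption: nominal 10-GigE rate
    paths = [
        ("18 ms path", 18.92, 40),   # RTT from nuttcp summary, -w40m
        ("80 ms path", 80.81, 100),  # RTT from nuttcp summary, -w100m
    ]
    for label, rtt_ms, buf_mib in paths:
        bdp_mib = bdp_bytes(line_rate_bps, rtt_ms) / MIB
        print(f"{label}: BDP ~ {bdp_mib:.1f} MiB, socket buffer {buf_mib} MiB")
```

Under these assumptions the 40 MB buffer is well above the BDP of the shorter
path, while the 100 MB buffer is only slightly above the BDP of the longer one.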

8 GB transfer across an 80 ms RTT path with autotuning and no hystart:

i7test7% nuttcp -n8g -i1 192.168.1.18
   11.2500 MB /   1.00 sec =   94.3703 Mbps     0 retrans
  519.0625 MB /   1.00 sec = 4354.1596 Mbps     0 retrans
  861.2500 MB /   1.00 sec = 7224.7970 Mbps     0 retrans
  871.0000 MB /   1.00 sec = 7306.4191 Mbps     0 retrans
  860.7500 MB /   1.00 sec = 7220.4438 Mbps     0 retrans
  869.0625 MB /   1.00 sec = 7290.3340 Mbps     0 retrans
  863.4375 MB /   1.00 sec = 7242.7707 Mbps     0 retrans
  860.4375 MB /   1.00 sec = 7218.0606 Mbps     0 retrans
  875.5000 MB /   1.00 sec = 7344.3071 Mbps     0 retrans
  863.1875 MB /   1.00 sec = 7240.8257 Mbps     0 retrans

 8192.0000 MB /  10.98 sec = 6259.4379 Mbps 12 %TX 27 %RX 0 retrans 80.81 msRTT

Ramps up quickly to 7+ Gbps, then appears to stabilize at that
level, with no TCP retransmissions.  Performance is somewhat
better than with autotuning and hystart enabled, but lower than
with a manually set 100 MB socket buffer.

8 GB transfer across an 80 ms RTT path with 100 MB socket buffer and no hystart:

i7test7% nuttcp -n8g -w100m -i1 192.168.1.18
  102.8750 MB /   1.00 sec =  862.9487 Mbps     0 retrans
  522.8750 MB /   1.00 sec = 4386.2811 Mbps   414 retrans
  881.5625 MB /   1.00 sec = 7394.6534 Mbps     0 retrans
 1164.3125 MB /   1.00 sec = 9766.6682 Mbps     0 retrans
 1170.5625 MB /   1.00 sec = 9819.7042 Mbps     0 retrans
 1166.8125 MB /   1.00 sec = 9788.2067 Mbps     0 retrans
 1159.8750 MB /   1.00 sec = 9729.1530 Mbps     0 retrans
  811.1250 MB /   1.00 sec = 6804.8017 Mbps    21 retrans
   73.2500 MB /   1.00 sec =  614.4674 Mbps     0 retrans
  884.6250 MB /   1.00 sec = 7420.2900 Mbps     0 retrans

 8192.0000 MB /  10.34 sec = 6647.9394 Mbps 13 %TX 31 %RX 435 retrans 80.81 msRTT

Disabling hystart on a large RTT path does not seem to play nice with
a manually specified socket buffer, resulting in TCP retransmissions
that limit the effective network performance.

This is a repeatable but extremely variable phenomenon.

i7test7% nuttcp -n8g -w100m -i1 192.168.1.18
  103.7500 MB /   1.00 sec =  870.3015 Mbps     0 retrans
 1146.3750 MB /   1.00 sec = 9616.4520 Mbps     0 retrans
 1175.9375 MB /   1.00 sec = 9864.6070 Mbps     0 retrans
  615.6875 MB /   1.00 sec = 5164.7353 Mbps    21 retrans
  139.2500 MB /   1.00 sec = 1168.1253 Mbps     0 retrans
 1090.0625 MB /   1.00 sec = 9143.8053 Mbps     0 retrans
 1170.4375 MB /   1.00 sec = 9818.6654 Mbps     0 retrans
 1174.5625 MB /   1.00 sec = 9852.8754 Mbps     0 retrans
 1174.8750 MB /   1.00 sec = 9855.6052 Mbps     0 retrans

 8192.0000 MB /   9.42 sec = 7292.9879 Mbps 14 %TX 34 %RX 21 retrans 80.81 msRTT

And:

i7test7% nuttcp -n8g -w100m -i1 192.168.1.18
  102.8125 MB /   1.00 sec =  862.4227 Mbps     0 retrans
 1148.4375 MB /   1.00 sec = 9633.6860 Mbps     0 retrans
 1177.4375 MB /   1.00 sec = 9877.3086 Mbps     0 retrans
 1168.1250 MB /   1.00 sec = 9798.9133 Mbps    11 retrans
  133.1250 MB /   1.00 sec = 1116.7457 Mbps     0 retrans
  479.8750 MB /   1.00 sec = 4025.4631 Mbps     0 retrans
 1150.6875 MB /   1.00 sec = 9652.4830 Mbps     0 retrans
 1177.3125 MB /   1.00 sec = 9876.0624 Mbps     0 retrans
 1177.3750 MB /   1.00 sec = 9876.0139 Mbps     0 retrans
  320.2500 MB /   1.00 sec = 2686.6452 Mbps    19 retrans
   64.9375 MB /   1.00 sec =  544.7363 Mbps     0 retrans
   73.6250 MB /   1.00 sec =  617.6113 Mbps     0 retrans

 8192.0000 MB /  12.39 sec = 5545.7570 Mbps 12 %TX 26 %RX 30 retrans 80.80 msRTT
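
To quantify "repeatable but extremely variable", the summary lines of the
three no-hystart runs with -w100m can be compared with the earlier hystart run
on the same path. This is a small sketch using only the figures printed above.
The function name summarize is illustrative.

```python
from statistics import mean

# Summary-line values copied from the three no-hystart, -w100m runs above.
no_hystart_runs = [
    {"mbps": 6647.9394, "retrans": 435},
    {"mbps": 7292.9879, "retrans": 21},
    {"mbps": 5545.7570, "retrans": 30},
]

# Summary-line value from the hystart, -w100m run on the same 80 ms path.
hystart_mbps = 8207.2049


def summarize(runs):
    """Return mean rate, max-min spread, and total retransmissions for the runs."""
    rates = [r["mbps"] for r in runs]
    return {
        "mean_mbps": mean(rates),
        "spread_mbps": max(rates) - min(rates),
        "total_retrans": sum(r["retrans"] for r in runs),
    }


if __name__ == "__main__":
    s = summarize(no_hystart_runs)
    print(f"no hystart: mean {s['mean_mbps']:.1f} Mbps, "
          f"spread {s['spread_mbps']:.1f} Mbps, {s['total_retrans']} retrans")
    print(f"hystart:    {hystart_mbps:.1f} Mbps, 0 retrans")
```

Three runs are too few for real statistics, so this only summarizes what was
observed.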

Re-enabling hystart immediately gives a clean test with no TCP retrans.

i7test7% nuttcp -n8g -w100m -i1 192.168.1.18
  103.8750 MB /   1.00 sec =  871.3353 Mbps     0 retrans
 1086.7500 MB /   1.00 sec = 9116.4474 Mbps     0 retrans
 1105.8125 MB /   1.00 sec = 9276.2276 Mbps     0 retrans
 1109.4375 MB /   1.00 sec = 9306.5339 Mbps     0 retrans
 1111.3125 MB /   1.00 sec = 9322.5327 Mbps     0 retrans
 1111.3750 MB /   1.00 sec = 9322.8053 Mbps     0 retrans
 1113.7500 MB /   1.00 sec = 9342.8962 Mbps     0 retrans
 1120.3125 MB /   1.00 sec = 9397.5711 Mbps     0 retrans

 8192.0000 MB /   8.38 sec = 8204.8394 Mbps 16 %TX 39 %RX 0 retrans 80.80 msRTT
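
The message does not say how hystart was switched on and off between runs. On
Linux kernels where CUBIC is available as tcp_cubic, one common way is the
module parameter file /sys/module/tcp_cubic/parameters/hystart. The sketch
below assumes that file exists and is writable (normally root only). The
helper names get_hystart and set_hystart are hypothetical.

```python
from pathlib import Path

# Assumption: tcp_cubic exposes its hystart module parameter at this path.
HYSTART_PARAM = Path("/sys/module/tcp_cubic/parameters/hystart")


def get_hystart(path=HYSTART_PARAM):
    """Return True if the parameter file holds anything other than 0."""
    return Path(path).read_text().strip() != "0"


def set_hystart(enabled, path=HYSTART_PARAM):
    """Write 1 (enable) or 0 (disable) to the parameter file."""
    Path(path).write_text("1\n" if enabled else "0\n")


if __name__ == "__main__":
    if HYSTART_PARAM.exists():
        print("hystart enabled:", get_hystart())
    else:
        print("tcp_cubic hystart parameter not found on this system")
```

The path argument is there so the helpers can be tried against an ordinary
file before touching the live setting.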

						-Bill