Message-ID: <20110309212839.GA1367@xanadu.blop.info>
Date:	Wed, 9 Mar 2011 22:28:39 +0100
From:	Lucas Nussbaum <lucas.nussbaum@...ia.fr>
To:	Stephen Hemminger <shemminger@...tta.com>
Cc:	Injong Rhee <rhee@...u.edu>, David Miller <davem@...emloft.net>,
	xiyou.wangcong@...il.com, netdev@...r.kernel.org,
	sangtae.ha@...il.com
Subject: Re: [PATCH] Make CUBIC Hystart more robust to RTT variations

On 09/03/11 at 11:56 -0800, Stephen Hemminger wrote:
> On Wed, 9 Mar 2011 19:25:05 +0100
> Lucas Nussbaum <lucas.nussbaum@...ia.fr> wrote:
> 
> > On 09/03/11 at 09:56 -0800, Stephen Hemminger wrote:
> > > On Wed, 9 Mar 2011 07:53:19 +0100
> > > Lucas Nussbaum <lucas.nussbaum@...ia.fr> wrote:
> > > 
> > > > On 08/03/11 at 20:30 -0500, Injong Rhee wrote:
> > > > > Now, both tools can be wrong. But that is not catastrophic since
> > > > > congestion avoidance can kick in to save the day. In a pipe where no
> > > > > other flows are competing, then exiting slow start too early can
> > > > > slow things down as the window can be still too small. But that is
> > > > > in fact when delays are most reliable. So those tests that say bad
> > > > > performance with hystart are in fact, where hystart is supposed to
> > > > > perform well.
> > > > 
> > > > Hi,
> > > > 
> > > > In my setup, there is no congestion at all (except the buffer bloat).
> > > > Without Hystart, transferring 8 Gb of data takes 9s, with CUBIC exiting
> > > > slow start at ~2000 packets.
> > > > With Hystart, transferring 8 Gb of data takes 19s, with CUBIC exiting
> > > > slow start at ~20 packets.
> > > > I don't think that this is "hystart performing well". We could just as
> > > > well remove slow start completely, and only do congestion avoidance,
> > > > then.
> > > > 
> > > > While I see the value in Hystart, it's clear that there are some flaws
> > > > in the current implementation. It probably makes sense to disable
> > > > hystart by default until those problems are fixed.
> > > 
> > > What is the speed and RTT time of your network?
> > > I think you may be blaming hystart for other issues in the network.
> > 
> > What kind of issues?
> > 
> > Host1 is connected through a gigabit ethernet LAN to Router1
> > Host2 is connected through a gigabit ethernet LAN to Router2
> > Router1 and Router2 are connected through an experimentation network at
> > 10 Gb/s
> > RTT between Host1 and Host2 is 11.3ms.
> > The network is not congested.
> > 
> > (I can provide access to the testbed if someone wants to do further
> > testing)
> 
> Your backbone is faster than the LAN, interesting.
> Could you check packet stats to see where packet drop is occurring?
> It could be that routers don't have enough buffering to take packet
> trains from 10G network and pace them out to 1G network.

I don't have access to the routers to check the packet counts here.
However, according to "netstat -s" on the sender(s), no retransmissions
are occurring, whether hystart is enabled or not: the host can just send
data at the network rate without experiencing congestion anywhere. Also,
it is unlikely that transient congestion in the backbone is an issue
according to the monitoring tools I have access to.
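
For anyone who wants to repeat the retransmission check without eyeballing
"netstat -s" output, here is a minimal sketch that reads the TCP counters
from /proc/net/snmp on a Linux sender. The function names are made up for
illustration; RetransSegs is the counter behind the retransmitted-segments
figure. Read it before and after a transfer and compare the two values.

```python
def parse_tcp_counters(snmp_text):
    """Parse the two 'Tcp:' lines of /proc/net/snmp into a name -> value dict.

    The file holds one header line (counter names) and one value line per
    protocol, both starting with the protocol name followed by a colon.
    """
    tcp_lines = [line.split()[1:] for line in snmp_text.splitlines()
                 if line.startswith("Tcp:")]
    if len(tcp_lines) != 2:
        raise ValueError("expected one header line and one value line for Tcp")
    names, values = tcp_lines
    return dict(zip(names, (int(v) for v in values)))


def retransmitted_segments(path="/proc/net/snmp"):
    """Return the host-wide count of retransmitted TCP segments."""
    with open(path) as f:
        return parse_tcp_counters(f.read())["RetransSegs"]
```

The counter is host-wide, so the comparison is only meaningful if no other
TCP traffic is retransmitting on the sender during the test.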

(Replying to your other mail as well)
> By my calculations (1G * 11.3ms) gives BDP of 941 packets which means
> CUBIC would ideally exit slow start at 900 or so packets. Old CUBIC
> slow start of 2000 packets means there is huge overshoot which means
> large packet loss burst which would cause a large CPU load on receiver
> processing SACK.

Since the backbone capacity is greater than or equal to the capacity of
the hosts' links, there's no reason why losses would occur if there's no
congestion caused by other traffic, right?
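
The BDP figure quoted above can be reproduced with a few lines of
arithmetic. This is only a sketch: the 1500-byte packet size is an
assumption (it is what makes 1 Gb/s * 11.3 ms come out at 941 packets), and
the transfer-time helper assumes that "8 Gb" means 8 gigabits.

```python
LINK_RATE_BPS = 1_000_000_000  # 1 Gb/s host links, as described in the thread
RTT_S = 0.0113                 # 11.3 ms round-trip time, as measured
PACKET_BYTES = 1500            # assumption: full-size Ethernet packets


def bdp_packets(rate_bps, rtt_s, packet_bytes):
    """Bandwidth-delay product, rounded down to whole packets."""
    bdp_bytes = rate_bps * rtt_s / 8  # bits in flight converted to bytes
    return int(bdp_bytes // packet_bytes)


def ideal_transfer_seconds(size_bits, rate_bps):
    """Lower bound on transfer time if the link is kept full throughout."""
    return size_bits / rate_bps


print(bdp_packets(LINK_RATE_BPS, RTT_S, PACKET_BYTES))    # 941
print(ideal_transfer_seconds(8_000_000_000, LINK_RATE_BPS))  # 8.0
```

Under those assumptions the 9s transfer without Hystart is close to the 8s
lower bound for a 1 Gb/s link, while the 19s transfer with Hystart is more
than twice that.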

> I assume you haven't done anything that would disable RFC1323
> support like turn off window scaling or tcp timestamps.

No, nothing strange that could cause different results.

I've tried to exclude hardware problems by using different parts of the
testbed (see map at
https://www.grid5000.fr/mediawiki/images/Renater5-g5k.jpg).  I used
machines in Rennes, Lille, Lyon and Grenoble today (using different
hardware). My original testing was done between Rennes and Nancy. The
same symptoms appear everywhere, in both directions, and disappear when
disabling hystart.
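
To reproduce the comparison, Hystart can be switched on and off through the
tcp_cubic module parameter. A minimal sketch, assuming tcp_cubic is loaded
and the script runs as root; the helper names are made up for illustration,
and the path argument exists only so the helpers can be tried on an ordinary
file.

```python
HYSTART_PARAM = "/sys/module/tcp_cubic/parameters/hystart"


def read_hystart(path=HYSTART_PARAM):
    """Return the current hystart setting (0 = disabled, non-zero = enabled)."""
    with open(path) as f:
        return int(f.read().strip())


def set_hystart(enabled, path=HYSTART_PARAM):
    """Enable or disable hystart; takes effect for connections opened afterwards."""
    with open(path, "w") as f:
        f.write("1" if enabled else "0")
```

Calling set_hystart(False) before starting a transfer and set_hystart(True)
afterwards gives the two configurations compared in this thread.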
-- 
| Lucas Nussbaum             MCF Université Nancy 2 |
| lucas.nussbaum@...ia.fr         LORIA / AlGorille |
| http://www.loria.fr/~lnussbau/  +33 3 54 95 86 19 |