Message-ID: <2421.1403051930@localhost.localdomain>
Date: Tue, 17 Jun 2014 17:38:50 -0700
From: Jay Vosburgh <jay.vosburgh@...onical.com>
To: Michal Kubecek <mkubecek@...e.cz>
cc: Yuchung Cheng <ycheng@...gle.com>,
Neal Cardwell <ncardwell@...gle.com>,
"David S. Miller" <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>,
Alexey Kuznetsov <kuznet@....inr.ac.ru>,
James Morris <jmorris@...ei.org>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
Patrick McHardy <kaber@...sh.net>
Subject: Re: [PATCH net] tcp: avoid multiple ssthresh reductions in one retransmit window

Michal Kubecek <mkubecek@...e.cz> wrote:
>On Tue, Jun 17, 2014 at 02:35:23PM -0700, Yuchung Cheng wrote:
>> On Tue, Jun 17, 2014 at 5:20 AM, Michal Kubecek <mkubecek@...e.cz> wrote:
>> > On Mon, Jun 16, 2014 at 08:44:04PM -0400, Neal Cardwell wrote:
>> >> On Mon, Jun 16, 2014 at 8:25 PM, Yuchung Cheng <ycheng@...gle.com> wrote:
>> >> > However Linux is inconsistent on the loss of a retransmission. It
>> >> > reduces ssthresh (and cwnd) if this happens on a timeout, but not in
>> >> > fast recovery (tcp_mark_lost_retrans). We should fix that and that
>> >> > should help dealing with traffic policers.
>> >>
>> >> Yes, great point!
>> >
>> > Does it mean the patch itself would be acceptable if the reasoning in
>> > its commit message was changed? Or would you prefer a different way to
>> > unify the two situations?
>>
>> It's the latter but it seems to belong to a different patch (and it'll
>> not solve the problem you are seeing).
>
>OK, thank you. I guess we will have to persuade them to move to cubic,
>which handles their problems much better.
>
>> The idea behind the RFC is that TCP should reduce cwnd and ssthresh
>> across round trips of send, but not within an RTT. Suppose cwnd was
>> 10 on first timeout, so cwnd becomes 1 and ssthresh is 5. Then after 3
>> round trips, we time out again. By the design of Reno this should
>> reset cwnd from 8 to 1, and ssthresh from 5 to 2.5.
>
>Shouldn't that be from 5 to 4? We reduce ssthresh to half of current
>cwnd, not current ssthresh.
>
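Indeed; for reference, the rule in the kernel (tcp_reno_ssthresh() in
net/ipv4/tcp_cong.c) is essentially a halving of the current cwnd with
a floor of 2:

	u32 tcp_reno_ssthresh(struct sock *sk)
	{
		const struct tcp_sock *tp = tcp_sk(sk);

		return max(tp->snd_cwnd >> 1U, 2U);
	}
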
>BTW, this is exactly the problem our customer is facing: they have a
>relatively fast line (15 Mb/s) but with big buffers, so that the
>round-trip times can rise from an unloaded 35 ms up to something like 1.5 s
>under full load.
>
>What happens is this: cwnd initially rises to ~2100, then the first drops
>are encountered, cwnd is set to 1 and ssthresh to ~1050. The slow start
>lets cwnd reach ssthresh but after that, a slow linear growth follows.
>In this state, all in-flight packets are dropped (simulation of what
>happens on router switchover) so that cwnd is reset to 1 again and
>ssthresh to something like 530-550 (cwnd was a bit higher than ssthresh).
>If a packet loss comes shortly after that, cwnd is still very low and
>ssthresh is reduced to half of that cwnd (i.e. much lower than half
>of ssthresh). If unlucky, one can even end up with ssthresh reduced to 2
>which takes really long to recover from.
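
To put numbers on that sequence, here is a toy sketch (Reno-style
halving only; growth between the loss events is not simulated, and the
cwnd values are the ones from the description above):

	#include <stdio.h>

	/* Each loss event applies the Reno-style reduction
	 * ssthresh = max(cwnd / 2, 2), and the RTO resets cwnd to 1.
	 */
	static unsigned int on_loss(unsigned int cwnd, unsigned int *ssthresh)
	{
		*ssthresh = cwnd / 2 > 2 ? cwnd / 2 : 2;
		return 1;
	}

	int main(void)
	{
		unsigned int cwnd = 2100, ssthresh = 0;

		cwnd = on_loss(cwnd, &ssthresh);  /* ssthresh = 1050 */
		cwnd = 1100;  /* slow start past ssthresh, then linear growth */
		cwnd = on_loss(cwnd, &ssthresh);  /* ssthresh = 550 */
		cwnd = 10;    /* next loss arrives early in slow start */
		cwnd = on_loss(cwnd, &ssthresh);  /* ssthresh = 5 */
		printf("cwnd = %u, ssthresh = %u\n", cwnd, ssthresh);
		return 0;
	}

Two losses in quick succession while cwnd is still small are what
drive ssthresh toward its floor of 2.
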
I'm also looking into a problem that exhibits very similar TCP
characteristics, even down to cwnd and ssthresh values similar to what
you cite. In this case, the situation has to do with high RTT (around
80 ms) connections competing with low RTT (1 ms) connections; cubic is
already in use.
Essentially, a high RTT connection to the server transfers data
in at a reasonable and steady rate until something causes some packets
to be lost (in this case, another transfer from a low RTT host to the
same server). Some packets are lost, and cwnd drops from ~2200 to ~300
(in stages: first to ~1500, then to ~600, then to ~300). The ssthresh
starts at around 1100, then drops to ~260, which is the lowest cwnd
value.
The recovery from the low cwnd situation is very slow; cwnd
climbs a bit and then remains essentially flat for around 5 seconds. It
then begins to climb until a few packets are lost again, and the cycle
repeats. If no further losses occur (if the competing traffic has
ceased, for example), recovery from a low cwnd (roughly 300 to 750) to the
full value (~2200) requires on the order of 20 seconds. The connection
exits recovery state fairly quickly, and most of the 20 seconds is spent
in open state.
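
For what it's worth, cubic's growth curve predicts a recovery time in
that ballpark. The window follows W(t) = C * (t - K)^3 + W_max, so it
regains the old maximum at t = K = cbrt((W_max - cwnd) / C), measured
in seconds regardless of RTT. A back-of-the-envelope check with the
values from my traces (and the usual scaling constant C = 0.4):

	#include <math.h>
	#include <stdio.h>

	int main(void)
	{
		const double C = 0.4;   /* cubic scaling constant */
		double w_max = 2200.0;  /* window before the losses (packets) */
		double cwnd = 300.0;    /* window after the losses */

		/* time for W(t) = C * (t - K)^3 + W_max to climb back
		 * to W_max, starting from cwnd at t = 0
		 */
		double K = cbrt((w_max - cwnd) / C);
		printf("~%.1f s to regain W_max\n", K);  /* ~16.8 s */
		return 0;
	}

That lands near the ~20 seconds I see, so the slow climb may simply be
cubic behaving as designed after such a deep window reduction.
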
-J
---
-Jay Vosburgh, jay.vosburgh@...onical.com