Message-Id: <20180613165543.0F92DA09E2@unicorn.suse.cz>
Date: Wed, 13 Jun 2018 18:55:43 +0200 (CEST)
From: Michal Kubecek <mkubecek@...e.cz>
To: netdev@...r.kernel.org
Cc: Eric Dumazet <edumazet@...gle.com>,
Yuchung Cheng <ycheng@...gle.com>,
Ilpo Jarvinen <ilpo.jarvinen@...sinki.fi>,
linux-kernel@...r.kernel.org
Subject: [RFC PATCH RESEND] tcp: avoid F-RTO if SACK and timestamps are disabled

When the F-RTO algorithm (RFC 5682) is used on a connection with neither
SACK nor timestamps (either because of (mis)configuration or because the
other endpoint does not advertise them), a specific loss pattern can make
the RTO grow exponentially until the sender is only able to send one packet
per two minutes (TCP_RTO_MAX).

One way to reproduce is to

  - make sure the connection uses neither SACK nor timestamps
  - let tp->reorder grow enough so that lost packets are retransmitted
    after RTO (rather than when high_seq - snd_una > reorder * MSS)
  - let the data flow stabilize
  - drop multiple sender packets in an "every second packet" pattern
  - make sure that either there is no new data to send or acks received in
    response to new data are also window updates (i.e. not dupacks by
    definition)

In this scenario, the sender keeps cycling between retransmitting the first
lost packet (step 1 of RFC 5682), sending new data by (2b) and timing out
again. In this loop, the sender only gets

  (a) acks for retransmitted segments (possibly together with old ones)
  (b) window updates

Without timestamps, neither can be used for the RTT estimator, and without
SACK, there are no newly sacked segments to estimate RTT from either.
Therefore each timeout doubles the RTO, and without usable RTT samples
there is nothing to counter the exponential growth.
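
To give a feel for how quickly this escalates, here is a minimal sketch of
the doubling arithmetic. It is not kernel code: the helper names and the
200 ms starting RTO are assumptions made for illustration; only the two
minute cap (TCP_RTO_MAX) comes from the description above.

```c
/* Illustrative only, not taken from the kernel sources. */
#define RTO_MAX_MS 120000u	/* two minutes, i.e. TCP_RTO_MAX */

/* RTO after a number of consecutive timeouts with no usable RTT sample
 * in between: doubled on every timeout and clamped to max_ms.
 */
static unsigned int rto_after_timeouts(unsigned int start_ms,
				       unsigned int max_ms,
				       unsigned int timeouts)
{
	unsigned int rto = start_ms < max_ms ? start_ms : max_ms;

	while (timeouts-- > 0 && rto < max_ms)
		rto = (rto > max_ms / 2) ? max_ms : rto * 2;
	return rto;
}

/* Number of consecutive timeouts needed for the RTO to reach max_ms. */
static unsigned int timeouts_until_max(unsigned int start_ms,
				       unsigned int max_ms)
{
	unsigned int rto = start_ms;
	unsigned int count = 0;

	if (start_ms == 0)
		return 0;	/* degenerate input, doubling would never grow */

	while (rto < max_ms) {
		rto = (rto > max_ms / 2) ? max_ms : rto * 2;
		count++;
	}
	return count;
}
```

With the assumed 200 ms starting value, timeouts_until_max(200, RTO_MAX_MS)
is 10, so ten timeouts in a row without a single usable RTT sample are
enough to pin the sender at the maximum.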

While disabling both SACK and timestamps doesn't make any sense, the
resulting behaviour is so pathological that it deserves an improvement.
(Also, both can be disabled on the other side.) Avoid the F-RTO algorithm
in case both SACK and timestamps are disabled so that the sender falls back
to traditional slow start retransmission.

Signed-off-by: Michal Kubecek <mkubecek@...e.cz>
---
net/ipv4/tcp_input.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 355d3dffd021..ed603f987b72 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2001,7 +2001,8 @@ void tcp_enter_loss(struct sock *sk)
 	 */
 	tp->frto = net->ipv4.sysctl_tcp_frto &&
 		   (new_recovery || icsk->icsk_retransmits) &&
-		   !inet_csk(sk)->icsk_mtup.probe_size;
+		   !inet_csk(sk)->icsk_mtup.probe_size &&
+		   (tcp_is_sack(tp) || tp->rx_opt.tstamp_ok);
 }
 
 /* If ACK arrived pointing to a remembered SACK, it means that our
--
2.17.1