netdev - [PATCH v2 net 1/5] tcp: feed correct number of pkts acked to cc modules also in recovery

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <1520604632-21185-2-git-send-email-ilpo.jarvinen@helsinki.fi>
Date:   Fri,  9 Mar 2018 16:10:28 +0200
From:   Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>
To:     netdev@...r.kernel.org
Cc:     Yuchung Cheng <ycheng@...gle.com>,
        Neal Cardwell <ncardwell@...gle.com>,
        Eric Dumazet <edumazet@...gle.com>
Subject: [PATCH v2 net 1/5] tcp: feed correct number of pkts acked to cc modules also in recovery

A miscalculation for the number of acknowledged packets occurs during
RTO recovery whenever SACK is not enabled and a cumulative ACK covers
any non-retransmitted skbs. The reason is that pkts_acked value
calculated in tcp_clean_rtx_queue is not correct for slow start after
RTO as it may include segments that were not lost and therefore did
not need retransmissions in the slow start following the RTO. Then
tcp_slow_start will add the excess into cwnd bloating it and
triggering a burst.

Instead, we want to pass only the number of retransmitted segments
that were covered by the cumulative ACK (and potentially newly sent
data segments too if the cumulative ACK covers that far).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>
---
 net/ipv4/tcp_input.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 9a1b3c1..0305f6d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3027,6 +3027,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack,
 	long seq_rtt_us = -1L;
 	long ca_rtt_us = -1L;
 	u32 pkts_acked = 0;
+	u32 rexmit_acked = 0;
+	u32 newdata_acked = 0;
 	u32 last_in_flight = 0;
 	bool rtt_update;
 	int flag = 0;
@@ -3056,8 +3058,10 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack,
 		}
 
 		if (unlikely(sacked & TCPCB_RETRANS)) {
-			if (sacked & TCPCB_SACKED_RETRANS)
+			if (sacked & TCPCB_SACKED_RETRANS) {
 				tp->retrans_out -= acked_pcount;
+				rexmit_acked += acked_pcount;
+			}
 			flag |= FLAG_RETRANS_DATA_ACKED;
 		} else if (!(sacked & TCPCB_SACKED_ACKED)) {
 			last_ackt = skb->skb_mstamp;
@@ -3068,8 +3072,11 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack,
 			last_in_flight = TCP_SKB_CB(skb)->tx.in_flight;
 			if (before(start_seq, reord))
 				reord = start_seq;
-			if (!after(scb->end_seq, tp->high_seq))
+			if (!after(scb->end_seq, tp->high_seq)) {
 				flag |= FLAG_ORIG_SACK_ACKED;
+			} else {
+				newdata_acked += acked_pcount;
+			}
 		}
 
 		if (sacked & TCPCB_SACKED_ACKED) {
@@ -3151,6 +3158,14 @@ static int tcp_clean_rtx_queue(struct sock *sk, u32 prior_fack,
 		}
 
 		if (tcp_is_reno(tp)) {
+			/* Due to discontinuity on RTO in the artificial
+			 * sacked_out calculations, TCP must restrict
+			 * pkts_acked without SACK to rexmits and new data
+			 * segments
+			 */
+			if (icsk->icsk_ca_state == TCP_CA_Loss)
+				pkts_acked = rexmit_acked + newdata_acked;
+
 			tcp_remove_reno_sacks(sk, pkts_acked);
 		} else {
 			int delta;
-- 
2.7.4