lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 29 Apr 2019 15:46:16 -0700
From:   Yuchung Cheng <ycheng@...gle.com>
To:     davem@...emloft.net, edumazet@...gle.com
Cc:     netdev@...r.kernel.org, ncardwell@...gle.com, soheil@...gle.com,
        Yuchung Cheng <ycheng@...gle.com>
Subject: [PATCH net-next 4/8] tcp: undo init congestion window on false SYNACK timeout

Linux implements RFC6298 and use an initial congestion window
of 1 upon establishing the connection if the SYNACK packet is
retransmitted 2 or more times. In cellular networks SYNACK timeouts
are often spurious if the wireless radio was dormant or idle. Also
some network path is longer than the default SYNACK timeout. In
both cases falsely starting with a minimal cwnd are detrimental
to performance.

This patch avoids doing so when the final ACK's TCP timestamp
indicates the original SYNACK was delivered. It remembers the
original SYNACK timestamp when SYNACK timeout has occurred and
re-uses the function to detect spurious SYN timeout conveniently.

Note that a server may receives multiple SYNs from and immediately
retransmits SYNACKs without any SYNACK timeout. This often happens
on when the client SYNs have timed out due to wireless delay
above. In this case since the server will still use the default
initial congestion (e.g. 10) because tp->undo_marker is reset in
tcp_init_metrics(). This is an intentional design because packets
are not lost but delayed.

This patch only covers regular TCP passive open. Fast Open is
supported in the next patch.

Signed-off-by: Yuchung Cheng <ycheng@...gle.com>
Signed-off-by: Neal Cardwell <ncardwell@...gle.com>
Signed-off-by: Eric Dumazet <edumazet@...gle.com>
---
 net/ipv4/tcp_input.c     | 2 ++
 net/ipv4/tcp_minisocks.c | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 30c6a42b1f5b..53b4c5a3113b 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6101,6 +6101,8 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 			 */
 			tcp_rearm_rto(sk);
 		} else {
+			tcp_try_undo_spurious_syn(sk);
+			tp->retrans_stamp = 0;
 			tcp_init_transfer(sk, BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB);
 			tp->copied_seq = tp->rcv_nxt;
 		}
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 79900f783e0d..9c2a0d36fb20 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -522,6 +522,11 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
 		newtp->rx_opt.ts_recent_stamp = 0;
 		newtp->tcp_header_len = sizeof(struct tcphdr);
 	}
+	if (req->num_timeout) {
+		newtp->undo_marker = treq->snt_isn;
+		newtp->retrans_stamp = div_u64(treq->snt_synack,
+					       USEC_PER_SEC / TCP_TS_HZ);
+	}
 	newtp->tsoffset = treq->ts_off;
 #ifdef CONFIG_TCP_MD5SIG
 	newtp->md5sig_info = NULL;	/*XXX*/
-- 
2.21.0.593.g511ec345e18-goog

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ