[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20240718-upstream-net-next-20240716-tcp-3rd-ack-consume-sk_socket-v2-1-d653f85639f6@kernel.org>
Date: Thu, 18 Jul 2024 12:33:56 +0200
From: "Matthieu Baerts (NGI0)" <matttbe@...nel.org>
To: Eric Dumazet <edumazet@...gle.com>,
"David S. Miller" <davem@...emloft.net>, David Ahern <dsahern@...nel.org>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Kuniyuki Iwashima <kuniyu@...zon.com>, Jerry Chu <hkchu@...gle.com>
Cc: netdev@...r.kernel.org, mptcp@...ts.linux.dev,
linux-kernel@...r.kernel.org, "Matthieu Baerts (NGI0)" <matttbe@...nel.org>
Subject: [PATCH net v2 1/2] tcp: process the 3rd ACK with sk_socket for
TFO/MPTCP
The 'Fixes' commit recently changed the behaviour of TCP by skipping the
processing of the 3rd ACK when a sk->sk_socket is set. The goal was to
skip tcp_ack_snd_check() in tcp_rcv_state_process() not to send an
unnecessary ACK in case of simultaneous connect(). Unfortunately, that
had an impact on TFO and MPTCP.
I started to look at the impact on MPTCP, because the MPTCP CI found
some issues with the MPTCP Packetdrill tests [1]. Then Paolo suggested
me to look at the impact on TFO with "plain" TCP.
For MPTCP, when receiving the 3rd ACK of a request adding a new path
(MP_JOIN), sk->sk_socket will be set, and point to the MPTCP sock that
has been created when the MPTCP connection got established before with
the first path. The newly added 'goto' will then skip the processing of
the segment text (step 7) and not go through tcp_data_queue() where the
MPTCP options are validated, and some actions are triggered, e.g.
sending the MPJ 4th ACK [2] as demonstrated by the new errors when
running a packetdrill test [3] establishing a second subflow.
This doesn't fully break MPTCP, mainly the 4th MPJ ACK that will be
delayed. Still, we don't want to have this behaviour as it delays the
switch to the fully established mode, and invalid MPTCP options in this
3rd ACK will not be caught any more. This modification also affects the
MPTCP + TFO feature as well, and being the reason why the selftests
started to be unstable the last few days [4].
For TFO, the existing 'basic-cookie-not-reqd' test [5] was no longer
passing: if the 3rd ACK contains data, and the connection is accept()ed
before receiving them, these data would no longer be processed, and thus
not ACKed.
One last thing about MPTCP, in case of simultaneous connect(), a
fallback to TCP will be done, which seems fine:
`../common/defaults.sh`
0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_MPTCP) = 3
+0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
+0 > S 0:0(0) <mss 1460, sackOK, TS val 100 ecr 0, nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
+0 < S 0:0(0) win 1000 <mss 1460, sackOK, TS val 407 ecr 0, nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
+0 > S. 0:0(0) ack 1 <mss 1460, sackOK, TS val 330 ecr 0, nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
+0 < S. 0:0(0) ack 1 win 65535 <mss 1460, sackOK, TS val 700 ecr 100, nop, wscale 8, mpcapable v1 flags[flag_h] key[skey=2]>
+0 write(3, ..., 100) = 100
+0 > . 1:1(0) ack 1 <nop, nop, TS val 845707014 ecr 700, nop, nop, sack 0:1>
+0 > P. 1:101(100) ack 1 <nop, nop, TS val 845958933 ecr 700>
Simultaneous SYN-data crossing is also not supported by TFO, see [6].
Link: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/9936227696 [1]
Link: https://datatracker.ietf.org/doc/html/rfc8684#fig_tokens [2]
Link: https://github.com/multipath-tcp/packetdrill/blob/mptcp-net-next/gtests/net/mptcp/syscalls/accept.pkt#L28 [3]
Link: https://netdev.bots.linux.dev/contest.html?executor=vmksft-mptcp-dbg&test=mptcp-connect-sh [4]
Link: https://github.com/google/packetdrill/blob/master/gtests/net/tcp/fastopen/server/basic-cookie-not-reqd.pkt#L21 [5]
Link: https://github.com/google/packetdrill/blob/master/gtests/net/tcp/fastopen/client/simultaneous-fast-open.pkt [6]
Fixes: 23e89e8ee7be ("tcp: Don't drop SYN+ACK for simultaneous connect().")
Suggested-by: Paolo Abeni <pabeni@...hat.com>
Suggested-by: Kuniyuki Iwashima <kuniyu@...zon.com>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@...nel.org>
---
Notes:
- We could also drop this 'goto consume', and send the unnecessary ACK
in this simultaneous connect case, which doesn't seem to be a "real"
case, more something for fuzzers. But that's not what the RFC 9293
recommends to do.
- v2:
- Check if the SYN bit is set instead of looking for TFO and MPTCP
specific attributes, as suggested by Kuniyuki.
- Updated the comment above
- Please note that the v2 has been sent mainly to satisfy the CI (to
be able to catch new bugs with MPTCP), and because the suggestion
from Kuniyuki looks better. It has not been sent to urge TCP
maintainers to review it quicker than it should, please take your
time and enjoy netdev.conf :)
---
net/ipv4/tcp_input.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index ff9ab3d01ced..bfe1bc69dc3e 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6820,7 +6820,12 @@ tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
if (sk->sk_shutdown & SEND_SHUTDOWN)
tcp_shutdown(sk, SEND_SHUTDOWN);
- if (sk->sk_socket)
+ /* For crossed SYN cases, not to send an unnecessary ACK.
+ * Note that sk->sk_socket can be assigned in other cases, e.g.
+ * with TFO (if accept()'ed before the 3rd ACK) and MPTCP (MPJ:
+ * sk_socket is the parent MPTCP sock).
+ */
+ if (sk->sk_socket && th->syn)
goto consume;
break;
--
2.45.2
Powered by blists - more mailing lists