Message-ID: <4bf331f1-123a-4290-868f-798c12a1f3f4@kernel.org>
Date: Tue, 25 Feb 2025 11:01:03 +0100
From: Matthieu Baerts <matttbe@...nel.org>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Kuniyuki Iwashima <kuniyu@...zon.com>, Simon Horman <horms@...nel.org>,
Florian Westphal <fw@...len.de>, netdev@...r.kernel.org,
eric.dumazet@...il.com, Yong-Hao Zou <yonghaoz1994@...il.com>,
"David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Neal Cardwell <ncardwell@...gle.com>
Subject: Re: [PATCH net-next] tcp: be less liberal in tsecr received while in
SYN_RECV state
Hi Eric,
On 24/02/2025 12:06, Eric Dumazet wrote:
> Yong-Hao Zou mentioned that linux was not strict as other OS in 3WHS,
> for flows using TCP TS option (RFC 7323)
>
> As hinted by an old comment in tcp_check_req(),
> we can check the TSecr value in the incoming packet corresponds
> to one of the SYNACK TSval values we have sent.
>
> In this patch, I record the oldest and most recent values
> that SYNACK packets have used.
>
> Send a challenge ACK if we receive a TSecr outside
> of this range, and increase a new SNMP counter.
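To make sure I read the description correctly, here is a minimal sketch of the check in Python. It is not the code from the patch: the function and parameter names are hypothetical, and it assumes the comparison is done modulo 2^32, because TSval is a 32-bit value that can wrap.

```python
U32 = 0xFFFFFFFF  # TCP timestamps are 32-bit values and may wrap around


def tsecr_in_window(tsecr, first_tsval, last_tsval):
    """Return True if tsecr lies in [first_tsval, last_tsval] modulo 2**32.

    first_tsval / last_tsval stand for the oldest and most recent TSval
    sent in SYNACK packets (hypothetical names, not the kernel fields).
    """
    return ((tsecr - first_tsval) & U32) <= ((last_tsval - first_tsval) & U32)


# A TSecr echoing one of the SYNACK TSval values is accepted.
in_range = tsecr_in_window(150, 100, 200)
# A TSecr older than the first SYNACK would trigger the challenge ACK.
too_old = tsecr_in_window(99, 100, 200)
# The window still works when TSval wraps between two SYNACKs.
wrapped = tsecr_in_window(0x5, 0xFFFFFFF0, 0x10)
# If the recorded values were never set (assumed here to be left at 0),
# any non-zero TSecr falls outside the window.
unset = tsecr_in_window(1000, 0, 0)
```

The last case is only an illustration of how such a check behaves with unset bounds; whether that is what happens on the MPTCP side is not verified.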
Thank you for this patch!
Sadly, it looks like it breaks MPTCP selftests, see [1] and [2]. When
there is a failure, we can see that the new counter is incremented [3]:
> # selftests: net/mptcp: mptcp_join.sh
(...)
> # 045 add multiple subflows IPv6
> # currently established: 1 [ OK ]
> # ack rx [FAIL] got 1 JOIN[s] ack rx expected 2
> # Server ns stats
> # TcpPassiveOpens 2 0.0
> # TcpAttemptFails 1 0.0
> # TcpInSegs 51 0.0
> # TcpOutSegs 59 0.0
> # TcpRetransSegs 3 0.0
> # TcpExtEmbryonicRsts 1 0.0
> # TcpExtTW 1 0.0
> # TcpExtTSECR_Rejected 3 0.0
> # TcpExtDelayedACKs 7 0.0
> # TcpExtTCPPureAcks 19 0.0
> # TcpExtTCPTimeouts 2 0.0
> # TcpExtTCPSynRetrans 3 0.0
> # TcpExtTCPOrigDataSent 24 0.0
> # TcpExtTCPACKSkippedSynRecv 2 0.0
> # TcpExtTCPDelivered 24 0.0
> # MPTcpExtMPCapableSYNRX 1 0.0
> # MPTcpExtMPCapableACKRX 1 0.0
> # MPTcpExtMPJoinSynRx 2 0.0
> # MPTcpExtMPJoinAckRx 1 0.0
> # Client ns stats
> # TcpActiveOpens 3 0.0
> # TcpEstabResets 1 0.0
> # TcpInSegs 59 0.0
> # TcpOutSegs 50 0.0
> # TcpRetransSegs 1 0.0
> # TcpInErrs 3 0.0
> # TcpOutRsts 1 0.0
> # TcpExtTW 2 0.0
> # TcpExtDelayedACKs 1 0.0
> # TcpExtTCPPureAcks 29 0.0
> # TcpExtTCPTimeouts 1 0.0
> # TcpExtTCPChallengeACK 2 0.0
> # TcpExtTCPSYNChallenge 3 0.0
> # TcpExtTCPSynRetrans 1 0.0
> # TcpExtTCPOrigDataSent 24 0.0
> # TcpExtTCPACKSkippedChallenge 1 0.0
> # TcpExtTCPDelivered 27 0.0
> # TcpExtTcpTimeoutRehash 1 0.0
> # MPTcpExtMPCapableSYNTX 1 0.0
> # MPTcpExtMPCapableSYNACKRX 1 0.0
> # MPTcpExtMPJoinSynAckRx 2 0.0
> # MPTcpExtMPJoinSynTx 2 0.0
> # MPTcpExtMPRstTx 1 0.0
> # MPTcpExtRcvWndShared 2 0.0
> # join Rx [FAIL] see above
> # join Tx [ OK ]
> # currently established: 0 [ OK ]
(...)
> # 064 simult IPv4 and IPv6 subflows, fullmesh 2x2
> # ack rx [FAIL] got 2 JOIN[s] ack rx expected 4
> # Server ns stats
> # TcpPassiveOpens 3 0.0
> # TcpAttemptFails 2 0.0
> # TcpInSegs 77 0.0
> # TcpOutSegs 74 0.0
> # TcpRetransSegs 6 0.0
> # TcpExtEmbryonicRsts 2 0.0
> # TcpExtTW 3 0.0
> # TcpExtTSECR_Rejected 6 0.0
> # TcpExtDelayedACKs 8 0.0
> # TcpExtTCPPureAcks 36 0.0
> # TcpExtTCPTimeouts 4 0.0
> # TcpExtTCPSynRetrans 6 0.0
> # TcpExtTCPOrigDataSent 25 0.0
> # TcpExtTCPACKSkippedSynRecv 4 0.0
> # TcpExtTCPDelivered 25 0.0
> # MPTcpExtMPCapableSYNRX 1 0.0
> # MPTcpExtMPCapableACKRX 1 0.0
> # MPTcpExtMPJoinSynRx 4 0.0
> # MPTcpExtMPJoinAckRx 2 0.0
> # MPTcpExtDuplicateData 1 0.0
> # MPTcpExtAddAddrTx 2 0.0
> # MPTcpExtEchoAdd 2 0.0
> # MPTcpExtRcvWndShared 2 0.0
> # Client ns stats
> # TcpActiveOpens 5 0.0
> # TcpEstabResets 2 0.0
> # TcpInSegs 74 0.0
> # TcpOutSegs 75 0.0
> # TcpRetransSegs 2 0.0
> # TcpInErrs 6 0.0
> # TcpOutRsts 2 0.0
> # TcpExtTW 3 0.0
> # TcpExtDelayedACKs 7 0.0
> # TcpExtTCPPureAcks 38 0.0
> # TcpExtTCPTimeouts 2 0.0
> # TcpExtTCPChallengeACK 4 0.0
> # TcpExtTCPSYNChallenge 6 0.0
> # TcpExtTCPSynRetrans 2 0.0
> # TcpExtTCPOrigDataSent 26 0.0
> # TcpExtTCPACKSkippedChallenge 2 0.0
> # TcpExtTCPDelivered 31 0.0
> # TcpExtTcpTimeoutRehash 2 0.0
> # MPTcpExtMPCapableSYNTX 1 0.0
> # MPTcpExtMPCapableSYNACKRX 1 0.0
> # MPTcpExtMPTCPRetrans 1 0.0
> # MPTcpExtMPJoinSynAckRx 4 0.0
> # MPTcpExtMPJoinSynTx 4 0.0
> # MPTcpExtAddAddr 2 0.0
> # MPTcpExtEchoAddTx 2 0.0
> # MPTcpExtMPRstTx 2 0.0
> # MPTcpExtRcvWndShared 4 0.0
> # join Rx [FAIL] see above
> # join Tx [ OK ]
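In both dumps above, the server's TcpExtTSECR_Rejected has the same value as its TcpExtTCPSynRetrans, and as TcpInErrs and TcpExtTCPSYNChallenge on the client (3 in the first test, 6 in the second). To compare such counters without reading the whole dump, a small helper along these lines can be used. This is only a sketch: the parsing rules are assumptions based on the output format shown here, and parse_counters() is a hypothetical name, not part of the selftests.

```python
import re

# Matches lines such as "> # TcpExtTSECR_Rejected 3 0.0"
COUNTER_RE = re.compile(r"^[>#\s]*([A-Za-z_][A-Za-z0-9_]*)\s+(\d+)\s+\d+\.\d+\s*$")


def parse_counters(text):
    """Group counters by namespace ("server", "client") from a stats dump."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.lstrip("># \t").rstrip()
        if stripped.endswith("ns stats"):
            # "Server ns stats" / "Client ns stats" start a new section
            current = stripped.split()[0].lower()
            sections[current] = {}
            continue
        match = COUNTER_RE.match(line)
        if match and current is not None:
            sections[current][match.group(1)] = int(match.group(2))
    return sections


# Excerpt from the first failing test above
sample = """\
> # Server ns stats
> # TcpExtTSECR_Rejected 3 0.0
> # TcpExtTCPSynRetrans 3 0.0
> # Client ns stats
> # TcpInErrs 3 0.0
> # TcpExtTCPSYNChallenge 3 0.0
> # join Rx [FAIL] see above
"""

stats = parse_counters(sample)
rejected = stats["server"]["TcpExtTSECR_Rejected"]
challenges = stats["client"]["TcpExtTCPSYNChallenge"]
```

If the text contains several tests, later sections overwrite earlier ones, so it is simpler to feed one test at a time.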
This appears to be easy to reproduce with a "non-debug" kernel:
$ ./mptcp_join.sh "add multiple subflows IPv6"
$ ./mptcp_join.sh "simult IPv4 and IPv6 subflows, fullmesh 2x2"
I haven't investigated yet, but I prefer to send this email now so the
patch can be delayed, if that's OK. Maybe you already have an idea of
what is wrong? Perhaps something checked in tcp_check_req() is not
initialised on the MPTCP side?
[1] https://netdev.bots.linux.dev/flakes.html?tn-needle=mptcp
[2] https://netdev.bots.linux.dev/contest.html?executor=vmksft-mptcp&ld-cases=1&pass=0&skip=0
[3] https://netdev-3.bots.linux.dev/vmksft-mptcp/results/6642/1-mptcp-join-sh/stdout
> nstat -az | grep TcpExtTSECR_Rejected
> TcpExtTSECR_Rejected 0 0.0
The underscore in the counter name looks strange. Would it be better
without it?
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.