[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAL+tcoBCqXiQ33XZv61vCes7_XV83DC0HSPy_w6eCn4pzzJeqA@mail.gmail.com>
Date: Thu, 7 Nov 2024 17:00:30 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Philo Lu <lulie@...ux.alibaba.com>, davem@...emloft.net, kuba@...nel.org,
pabeni@...hat.com, dsahern@...nel.org, horms@...nel.org,
netdev@...r.kernel.org, Jason Xing <kernelxing@...cent.com>
Subject: Re: [PATCH net-next] tcp: avoid RST in 3-way shakehands due to
failure in tcp_timewait_state_process
On Thu, Nov 7, 2024 at 4:37 PM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Thu, Nov 7, 2024 at 9:27 AM Jason Xing <kerneljasonxing@...il.com> wrote:
> >
> > On Thu, Nov 7, 2024 at 4:22 PM Philo Lu <lulie@...ux.alibaba.com> wrote:
> > >
> > >
> > >
> > > On 2024/11/7 16:01, Jason Xing wrote:
> > > > On Thu, Nov 7, 2024 at 3:51 PM Philo Lu <lulie@...ux.alibaba.com> wrote:
> > > >>
> > > >> Hi Jason,
> > > >>
> > > >> On 2024/11/5 10:55, Jason Xing wrote:
> > > >>> From: Jason Xing <kernelxing@...cent.com>
> > > >>>
> > > >>> We found there are rare chances that some RST packets appear during
> > > >>> the shakehands because the timewait socket cannot accept the SYN and
> > > >>> doesn't return TCP_TW_SYN in tcp_timewait_state_process().
> > > >>>
> > > >>> Here is how things happen in production:
> > > >>> Time Client(A) Server(B)
> > > >>> 0s SYN-->
> > > >>> ...
> > > >>> 132s <-- FIN
> > > >>> ...
> > > >>> 169s FIN-->
> > > >>> 169s <-- ACK
> > > >>> 169s SYN-->
> > > >>> 169s <-- ACK
> > > >>> 169s RST-->
> > > >>> As above picture shows, the two flows have a start time difference
> > > >>> of 169 seconds. B starts to send FIN so it will finally enter into
> > > >>> TIMEWAIT state. Nearly at the same time A launches a new connection
> > > >>> that soon is reset by itself due to receiving a ACK.
> > > >>>
> > > >>> There are two key checks in tcp_timewait_state_process() when timewait
> > > >>> socket in B receives the SYN packet:
> > > >>> 1) after(TCP_SKB_CB(skb)->seq, rcv_nxt)
> > > >>> 2) (s32)(READ_ONCE(tcptw->tw_ts_recent) - tmp_opt.rcv_tsval) < 0)
> > > >>>
> > > >>> Regarding the first rule, it fails as expected because in the first
> > > >>> connection the seq of SYN sent from A is 1892994276, then 169s have
> > > >>> passed, the second SYN has 239034613 (caused by overflow of s32).
> > > >>>
> > > >>> Then how about the second rule?
> > > >>> It fails again!
> > > >>> Let's take a look at how the tsval comes out:
> > > >>> __tcp_transmit_skb()
> > > >>> -> tcp_syn_options()
> > > >>> -> opts->tsval = tcp_skb_timestamp_ts(tp->tcp_usec_ts, skb) + tp->tsoffset;
> > > >>> The timestamp depends on two things, one is skb->skb_mstamp_ns, the
> > > >>> other is tp->tsoffset. The latter value is fixed, so we don't need
> > > >>> to care about it. If both operations (sending FIN and then starting
> > > >>> sending SYN) from A happen in 1ms, then the tsval would be the same.
> > > >>> It can be clearly seen in the tcpdump log. Notice that the tsval is
> > > >>> with millisecond precision.
> > > >>>
> > > >>> Based on the above analysis, I decided to make a small change to
> > > >>> the check in tcp_timewait_state_process() so that the second flow
> > > >>> would not fail.
> > > >>>
> > > >>
> > > >> I wonder what a bad result the RST causes. As far as I know, the client
> > > >> will not close the connect and return. Instead, it re-sends an SYN in
> > > >> TCP_TIMEOUT_MIN(2) jiffies (implemented in
> > > >> tcp_rcv_synsent_state_process). So the second connection could still be
> > > >> established successfully, at the cost of a bit more delay. Like:
> > > >>
> > > >> Time Client(A) Server(B)
> > > >> 0s SYN-->
> > > >> ...
> > > >> 132s <-- FIN
> > > >> ...
> > > >> 169s FIN-->
> > > >> 169s <-- ACK
> > > >> 169s SYN-->
> > > >> 169s <-- ACK
> > > >> 169s RST-->
> > > >> ~2jiffies SYN-->
It doesn't happen. I dare to say the SYN will not be retransmitted
soon which I can tell from many tcpdump logs I captured.
> > > >> <-- SYN,ACK
> > > >
> > > > That's exactly what I meant here :) Originally I didn't expect the
> > > > application to relaunch a connection in this case.
> > >
> > > s/application/kernel/, right?
> >
> > No. Perhaps I didn't make myself clear. If the kernel doesn't silently
> > drop the SYN and then send back an ACK, the application has to call
> > the connect() syscall again.
>
> My suggestion to stop the confusion:
Oh, right now, I realized that Philo and I are not on the same page :(
Please forget that conversation.
My points are:
1) If B silently drops the SYN packet, A will retransmit an SYN packet
and then the connection will be established. It's what I tried to
propose and would like to see. It also adheres to the RFC 6191.
2) As kuniyuki pointed out on the contrary, sending an ACK (like the
current implementation) instead of silently dropping the SYN packet is
actually a challenge ack. If so, I think we need to consider this ACK
as a challenge ack like what tcp_send_challenge_ack() does.
I'd like to hear more about your opinions on the above conclusion.
Thanks!
Powered by blists - more mailing lists