[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iJCYaLuHHt4umWzUHCXHLHtdwThz3uuQ9vFbG9_=eibdg@mail.gmail.com>
Date: Thu, 2 Jan 2025 09:16:53 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: "Dujeong.lee" <dujeong.lee@...sung.com>
Cc: Youngmin Nam <youngmin.nam@...sung.com>, Jakub Kicinski <kuba@...nel.org>,
Neal Cardwell <ncardwell@...gle.com>, davem@...emloft.net, dsahern@...nel.org,
pabeni@...hat.com, horms@...nel.org, guo88.liu@...sung.com,
yiwang.cai@...sung.com, netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
joonki.min@...sung.com, hajun.sung@...sung.com, d7271.choe@...sung.com,
sw.ju@...sung.com, iamyunsu.kim@...sung.com, kw0619.kim@...sung.com,
hsl.lim@...sung.com, hanbum22.lee@...sung.com, chaemoo.lim@...sung.com,
seungjin1.yu@...sung.com
Subject: Re: [PATCH] tcp: check socket state before calling WARN_ON
On Thu, Jan 2, 2025 at 1:22 AM Dujeong.lee <dujeong.lee@...sung.com> wrote:
>
> On Mon, Dec 30, 2024 at 6:34 PM Eric Dumazet <edumazet@...gle.com>
> wrote:
> >
> > On Mon, Dec 30, 2024 at 1:24 AM Dujeong.lee <dujeong.lee@...sung.com>
> > wrote:
> > >
> > > On Wed, Dec 18, 2024 7:28 PM Eric Dumazet <edumazet@...gle.com> wrote:
> > >
> > > > On Wed, Dec 18, 2024 at 11:18 AM Dujeong.lee
> > > > <dujeong.lee@...sung.com>
> > > > wrote:
> > > > >
> > > > > Tue, December 10, 2024 at 4:10 PM Dujeong Lee wrote:
> > > > > > On Tue, Dec 10, 2024 at 12:39 PM Dujeong Lee wrote:
> > > > > > > On Mon, Dec 9, 2024 at 7:21 PM Eric Dumazet
> > > > > > > <edumazet@...gle.com>
> > > > wrote:
> > > > > > > > On Mon, Dec 9, 2024 at 11:16 AM Dujeong.lee
> > > > > > > > <dujeong.lee@...sung.com>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > >
> > > > > > > > > Thanks for all the details on packetdrill and we are also
> > > > > > > > > exploring
> > > > > > > > USENIX 2013 material.
> > > > > > > > > I have one question. The issue happens when DUT receives
> > > > > > > > > TCP ack with
> > > > > > > > large delay from network, e.g., 28seconds since last Tx. Is
> > > > > > > > packetdrill able to emulate this network delay (or
> > > > > > > > congestion) in script
> > > > > > > level?
> > > > > > > >
> > > > > > > > Yes, the packetdrill scripts can wait an arbitrary amount of
> > > > > > > > time between each event
> > > > > > > >
> > > > > > > > +28 <next event>
> > > > > > > >
> > > > > > > > 28 seconds seems okay. If the issue was triggered after 4
> > > > > > > > days, packetdrill would be impractical ;)
> > > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > We secured new ramdump.
> > > > > > > Please find the below values with TCP header details.
> > > > > > >
> > > > > > > tp->packets_out = 0
> > > > > > > tp->sacked_out = 0
> > > > > > > tp->lost_out = 1
> > > > > > > tp->retrans_out = 1
> > > > > > > tp->rx_opt.sack_ok = 5 (tcp_is_sack(tp)) mss_cache = 1400
> > > > > > > ((struct inet_connection_sock *)sk)->icsk_ca_state = 4
> > > > > > > ((struct inet_connection_sock *)sk)->icsk_pmtu_cookie = 1500
> > > > > > >
> > > > > > > Hex from ip header:
> > > > > > > 45 00 00 40 75 40 00 00 39 06 91 13 8E FB 2A CA C0 A8 00 F7 01
> > > > > > > BB
> > > > > > > A7 CC 51
> > > > > > > F8 63 CC 52 59 6D A6 B0 10 04 04 77 76 00 00 01 01 08 0A 89 72
> > > > > > > C8
> > > > > > > 42
> > > > > > > 62 F5
> > > > > > > F5 D1 01 01 05 0A 52 59 6D A5 52 59 6D A6
> > > > > > >
> > > > > > > Transmission Control Protocol
> > > > > > > Source Port: 443
> > > > > > > Destination Port: 42956
> > > > > > > TCP Segment Len: 0
> > > > > > > Sequence Number (raw): 1375232972 Acknowledgment number (raw):
> > > > > > > 1381592486
> > > > > > > 1011 .... = Header Length: 44 bytes (11)
> > > > > > > Flags: 0x010 (ACK)
> > > > > > > Window: 1028
> > > > > > > Calculated window size: 1028
> > > > > > > Urgent Pointer: 0
> > > > > > > Options: (24 bytes), No-Operation (NOP), No-Operation (NOP),
> > > > > > > Timestamps, No-Operation (NOP), No-Operation (NOP), SACK
> > > > > > >
> > > > > > > If anyone wants to check other values, please feel free to ask
> > > > > > > me
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Dujeong.
> > > > > >
> > > > > > I have a question.
> > > > > >
> > > > > > From the latest ramdump I could see that
> > > > > > 1) tcp_sk(sk)->packets_out = 0
> > > > > > 2) inet_csk(sk)->icsk_backoff = 0
> > > > > > 3) sk_write_queue.len = 0
> > > > > > which suggests that tcp_write_queue_purge was indeed called.
> > > > > >
> > > > > > Noting that:
> > > > > > 1) tcp_write_queue_purge reset packets_out to 0 and
> > > > > > 2) in_flight should be non-negative where in_flight =
> > > > > > packets_out - left_out + retrans_out, what if we reset left_out
> > > > > > and retrans_out as well in tcp_write_queue_purge?
> > > > > >
> > > > > > Do we see any potential issue with this?
> > > > >
> > > > > Hello Eric and Neal.
> > > > >
> > > > > It is a gentle reminder.
> > > > > Could you please review the latest ramdump values and and question?
> > > >
> > > > It will have to wait next year, Neal is OOO.
> > > >
> > > > I asked a packetdrill reproducer, I can not spend days working on an
> > > > issue that does not trigger in our production hosts.
> > > >
> > > > Something could be wrong in your trees, or perhaps some eBPF program
> > > > changing the state of the socket...
> > >
> > > Hi Eric
> > >
> > > I tried to make packetdrill script for local mode, which injects delayed
> > acks for data and FIN after close.
> > >
> > > // Test basic connection teardown where local process closes first:
> > > // the local process calls close() first, so we send a FIN.
> > > // Then we receive an delayed ACK for data and FIN.
> > > // Then we receive a FIN and ACK it.
> > >
> > > `../common/defaults.sh`
> > > 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 //
> > Create socket
> > > +.01...0.011 connect(3, ..., ...) = 0 //
> > Initiate connection
> > > +0 > S 0:0(0) <...> // Send
> > SYN
> > > +0 < S. 0:0(0) ack 1 win 32768 <mss 1000,nop,wscale 6,nop,nop,sackOK>
> > // Receive SYN-ACK with TCP options
> > > +0 > . 1:1(0) ack 1 // Send
> > ACK
> > >
> > > +0 write(3, ..., 1000) = 1000 //
> > Write 1000 bytes
> > > +0 > P. 1:1001(1000) ack 1 // Send
> > data with PSH flag
> > >
> > > +0 close(3) = 0 // Local
> > side initiates close
> > > +0 > F. 1001:1001(0) ack 1 // Send
> > FIN
> > > +1 < . 1:1(0) ack 1001 win 257 //
> > Receive ACK for data
> > > +0 < . 1:1(0) ack 1002 win 257 //
> > Receive ACK for FIN
> > >
> > > +0 < F. 1:1(0) ack 1002 win 257 //
> > Receive FIN from remote
> > > +0 > . 1002:1002(0) ack 2 // Send
> > ACK for FIN
> > >
> > >
> > > But got below error when I run the script.
> > >
> > > $ sudo ./packetdrill ../tcp/close/close-half-delayed-ack.pkt
> > > ../tcp/close/close-half-delayed-ack.pkt:22: error handling packet:
> > > live packet field tcp_fin: expected: 0 (0x0) vs actual: 1 (0x1) script
> > > packet: 1.010997 . 1002:1002(0) ack 2 actual packet: 0.014840 F.
> > > 1001:1001(0) ack 1 win 256
> >
> > This means the FIN was retransmited earlier.
> > Then the data segment was probably also retransmit.
> >
> > You can use "tcpdump -i any &" while developing your script.
> >
> > 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
> > // Create socket
> > +.01...0.111 connect(3, ..., ...) = 0
> > // Initiate connection
> > +0 > S 0:0(0) <...>
> > // Send SYN
> > +.1 < S. 0:0(0) ack 1 win 32768 <mss 1000,nop,wscale
> > 6,nop,nop,sackOK> // Receive SYN-ACK with TCP options
> > +0 > . 1:1(0) ack 1
> > // Send ACK
> >
> > +0 write(3, ..., 1000) = 1000
> > // Write 1000 bytes
> > +0 > P. 1:1001(1000) ack 1
> > // Send data with PSH flag
> >
> > +0 close(3) = 0
> > // Local side initiates close
> > +0 > F. 1001:1001(0) ack 1
> > // Send FIN
> > +.2 > F. 1001:1001(0) ack 1 // FIN retransmit
> > +.2~+.4 > P. 1:1001(1000) ack 1 // RTX
> >
> > +0 < . 1:1(0) ack 1001 win 257
> > // Receive ACK for data
> > +0 > F. 1001:1001(0) ack 1 // FIN retransmit
> > +0 < . 1:1(0) ack 1002 win 257
> > // Receive ACK for FIN
> >
> > +0 < F. 1:1(0) ack 1002 win 257
> > // Receive FIN from remote
> > +0 > . 1002:1002(0) ack 2
> > // Send ACK for FIN
>
> Hi Eric,
>
> I modified the script and inlined tcpdump capture
>
> `../common/defaults.sh`
> 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 // Create socket
> +.01...0.011 connect(3, ..., ...) = 0 // Initiate connection
> +0 > S 0:0(0) <...> // Send SYN
> 1 0.000000 192.168.114.235 192.0.2.1 TCP 80 40784 → 8080 [SYN] Seq=0 Win=65535 Len=0 MSS=1460 SACK_PERM TSval=2913446377 TSecr=0 WS=256
>
> +0 < S. 0:0(0) ack 1 win 32768 <mss 1000,nop,wscale 6,nop,nop,sackOK> // Receive SYN-ACK with TCP options
> 2 0.000209 192.0.2.1 192.168.114.235 TCP 72 8080 → 40784 [SYN, ACK] Seq=0 Ack=1 Win=32768 Len=0 MSS=1000 WS=64 SACK_PERM
>
> +0 > . 1:1(0) ack 1 // Send ACK
> 3 0.000260 192.168.114.235 192.0.2.1 TCP 60 40784 → 8080 [ACK] Seq=1 Ack=1 Win=65536 Len=0
>
> +0 write(3, ..., 1000) = 1000 // Write 1000 bytes
> +0 > P. 1:1001(1000) ack 1 // Send data with PSH flag
> 4 0.000344 192.168.114.235 192.0.2.1 TCP 1060 40784 → 8080 [PSH, ACK] Seq=1 Ack=1 Win=65536 Len=1000
>
> +0 close(3) = 0 // Local side initiates close
> +0 > F. 1001:1001(0) ack 1 // Send FIN
> 5 0.000381 192.168.114.235 192.0.2.1 TCP 60 40784 → 8080 [FIN, ACK] Seq=1001 Ack=1 Win=65536 Len=0
>
> +.2 > F. 1001:1001(0) ack 1 // FIN retransmit
> 6 0.004545 192.168.114.235 192.0.2.1 TCP 60 [TCP Retransmission] 40784 → 8080 [FIN, ACK] Seq=1001 Ack=1 Win=65536 Len=0
>
> +.2~+.4 > P. 1:1001(1000) ack 1 // RTX
> +0 < . 1:1(0) ack 1001 win 257 // Receive ACK for data
> +0 < . 1:1(0) ack 1002 win 257 // Receive ACK for FIN
>
> +0 < F. 1:1(0) ack 1002 win 257 // Receive FIN from remote
> +0 > . 1002:1002(0) ack 2 // Send ACK for FIN
>
>
> And hit below error.
> ../tcp/close/close-half-delayed-ack.pkt:18: error handling packet: timing error: expected outbound packet at 0.210706 sec but happened at 0.014838 sec; tolerance 0.025002 sec
> script packet: 0.210706 F. 1001:1001(0) ack 1
> actual packet: 0.014838 F. 1001:1001(0) ack 1 win 256
>
> For me, it looks like delay in below line does not take effect by packetdrill.
> +.2 > F. 1001:1001(0) ack 1 // FIN retransmit
I think you misunderstood how packetdrill works.
In packetdrill, you can specify delays for incoming packets (to
account for network delays, or remote TCP stack bugs/behavior)
But outgoing packets are generated by the kernel TCP stack.
Packetdrill checks that these packets have the expected layouts and
sent at the expected time.
Powered by blists - more mailing lists