Open Source and information security mailing list archives
 
Message-ID: <CANn89iK8Kdpe_uZ2Q8z3k2=d=jUVCV5Z3hZa4jFedUgKm9hesQ@mail.gmail.com>
Date: Wed, 18 Dec 2024 11:27:58 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: "Dujeong.lee" <dujeong.lee@...sung.com>
Cc: Youngmin Nam <youngmin.nam@...sung.com>, Jakub Kicinski <kuba@...nel.org>, 
	Neal Cardwell <ncardwell@...gle.com>, davem@...emloft.net, dsahern@...nel.org, 
	pabeni@...hat.com, horms@...nel.org, guo88.liu@...sung.com, 
	yiwang.cai@...sung.com, netdev@...r.kernel.org, linux-kernel@...r.kernel.org, 
	joonki.min@...sung.com, hajun.sung@...sung.com, d7271.choe@...sung.com, 
	sw.ju@...sung.com, iamyunsu.kim@...sung.com, kw0619.kim@...sung.com, 
	hsl.lim@...sung.com, hanbum22.lee@...sung.com, chaemoo.lim@...sung.com, 
	seungjin1.yu@...sung.com
Subject: Re: [PATCH] tcp: check socket state before calling WARN_ON

On Wed, Dec 18, 2024 at 11:18 AM Dujeong.lee <dujeong.lee@...sung.com> wrote:
>
> Tue, December 10, 2024 at 4:10 PM Dujeong Lee wrote:
> > On Tue, Dec 10, 2024 at 12:39 PM Dujeong Lee wrote:
> > > On Mon, Dec 9, 2024 at 7:21 PM Eric Dumazet <edumazet@...gle.com> wrote:
> > > > > On Mon, Dec 9, 2024 at 11:16 AM Dujeong.lee <dujeong.lee@...sung.com> wrote:
> > > > >
> > > >
> > > > > > Thanks for all the details on packetdrill; we are also
> > > > > > exploring the USENIX 2013 material.
> > > > > > I have one question. The issue happens when the DUT receives a
> > > > > > TCP ACK with a large delay from the network, e.g., 28 seconds
> > > > > > since the last Tx. Is packetdrill able to emulate this network
> > > > > > delay (or congestion) at the script level?
> > > >
> > > > Yes, the packetdrill scripts can wait an arbitrary amount of time
> > > > between each event
> > > >
> > > > +28 <next event>
> > > >
> > > > 28 seconds seems okay. If the issue was triggered after 4 days,
> > > > packetdrill would be impractical ;)
> > >
> > > Hi all,
> > >
> > > We secured a new ramdump.
> > > Please find the values below, along with the TCP header details.
> > >
> > > tp->packets_out = 0
> > > tp->sacked_out = 0
> > > tp->lost_out = 1
> > > tp->retrans_out = 1
> > > tp->rx_opt.sack_ok = 5 (tcp_is_sack(tp))
> > > mss_cache = 1400
> > > ((struct inet_connection_sock *)sk)->icsk_ca_state = 4
> > > ((struct inet_connection_sock *)sk)->icsk_pmtu_cookie = 1500
> > >
> > > Hex from IP header:
> > > 45 00 00 40 75 40 00 00 39 06 91 13 8E FB 2A CA
> > > C0 A8 00 F7 01 BB A7 CC 51 F8 63 CC 52 59 6D A6
> > > B0 10 04 04 77 76 00 00 01 01 08 0A 89 72 C8 42
> > > 62 F5 F5 D1 01 01 05 0A 52 59 6D A5 52 59 6D A6
> > >
> > > Transmission Control Protocol
> > > Source Port: 443
> > > Destination Port: 42956
> > > TCP Segment Len: 0
> > > Sequence Number (raw): 1375232972
> > > Acknowledgment number (raw): 1381592486
> > > 1011 .... = Header Length: 44 bytes (11)
> > > Flags: 0x010 (ACK)
> > > Window: 1028
> > > Calculated window size: 1028
> > > Urgent Pointer: 0
> > > Options: (24 bytes), No-Operation (NOP), No-Operation (NOP),
> > > Timestamps, No-Operation (NOP), No-Operation (NOP), SACK
> > >
> > > If anyone wants to check other values, please feel free to ask me
> > >
> > > Thanks,
> > > Dujeong.
> >
> > I have a question.
> >
> > From the latest ramdump I could see that
> > 1) tcp_sk(sk)->packets_out = 0
> > 2) inet_csk(sk)->icsk_backoff = 0
> > 3) sk_write_queue.len = 0
> > which suggests that tcp_write_queue_purge was indeed called.
> >
> > Noting that:
> > 1) tcp_write_queue_purge resets packets_out to 0 and
> > 2) in_flight should be non-negative, where in_flight = packets_out -
> > left_out + retrans_out,
> > what if we reset left_out and retrans_out as well in
> > tcp_write_queue_purge?
> >
> > Do we see any potential issue with this?
>
> Hello Eric and Neal.
>
> This is a gentle reminder.
> Could you please review the latest ramdump values and question?

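For readers who want to check the quoted hex dump against the field summary, here is a minimal Python sketch that decodes the 64 bytes as an IPv4 header followed by a TCP header with options. The function names (decode, parse_options) are made up for this illustration and are not part of any tool mentioned in the thread.

```python
import ipaddress
import struct

# The 64 bytes quoted in the ramdump report (IPv4 header + TCP header).
RAW = bytes.fromhex(
    "45 00 00 40 75 40 00 00 39 06 91 13 8E FB 2A CA"
    "C0 A8 00 F7 01 BB A7 CC 51 F8 63 CC 52 59 6D A6"
    "B0 10 04 04 77 76 00 00 01 01 08 0A 89 72 C8 42"
    "62 F5 F5 D1 01 01 05 0A 52 59 6D A5 52 59 6D A6"
)


def parse_options(opts):
    """Walk the TCP option bytes; decode Timestamps (8) and SACK (5)."""
    out = []
    i = 0
    while i < len(opts):
        kind = opts[i]
        if kind == 0:          # End of option list
            break
        if kind == 1:          # NOP is a single byte
            i += 1
            continue
        if i + 1 >= len(opts):  # truncated option, stop parsing
            break
        length = opts[i + 1]
        if length < 2:         # malformed length, stop parsing
            break
        body = opts[i + 2:i + length]
        if kind == 8 and len(body) == 8:
            out.append(("timestamps", struct.unpack("!II", body)))
        elif kind == 5 and len(body) % 8 == 0:
            edges = struct.unpack("!%dI" % (len(body) // 4), body)
            out.append(("sack", list(zip(edges[::2], edges[1::2]))))
        else:
            out.append((kind, bytes(body)))
        i += length
    return out


def decode(raw):
    """Decode the IPv4 and TCP fixed headers plus TCP options."""
    ihl = (raw[0] & 0x0F) * 4
    total_len = struct.unpack("!H", raw[2:4])[0]
    tcp = raw[ihl:]
    sport, dport, seq, ack, off_flags, window = struct.unpack("!HHIIHH", tcp[:16])
    data_off = (off_flags >> 12) * 4
    return {
        "total_len": total_len,
        "src": str(ipaddress.IPv4Address(raw[12:16])),
        "dst": str(ipaddress.IPv4Address(raw[16:20])),
        "sport": sport,
        "dport": dport,
        "seq": seq,
        "ack": ack,
        "header_len": data_off,
        "flags": off_flags & 0x1FF,
        "window": window,
        "payload_len": total_len - ihl - data_off,
        "options": parse_options(tcp[20:data_off]),
    }


if __name__ == "__main__":
    for key, value in decode(RAW).items():
        print(key, "=", value)
```

The decoded ports, sequence number, acknowledgment number, header length and window agree with the summary in the quoted message. The single SACK block is one sequence number wide and its right edge equals the acknowledgment number.
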
It will have to wait until next year; Neal is OOO.

I asked for a packetdrill reproducer; I cannot spend days working on an
issue that does not trigger on our production hosts.

Something could be wrong in your trees, or perhaps some eBPF program
changing the state of the socket...
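
To make the arithmetic behind the quoted question concrete, here is a minimal Python sketch that plugs the ramdump counters into in_flight = packets_out - left_out + retrans_out. It assumes left_out = sacked_out + lost_out, which is how the kernel's tcp_left_out() helper computes it. The TcpCounters class and the purge_with_extra_resets() helper are hypothetical models for this illustration, not kernel code.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TcpCounters:
    """Hypothetical model of the four tcp_sock counters in the ramdump."""
    packets_out: int
    sacked_out: int
    lost_out: int
    retrans_out: int

    def left_out(self):
        # Assumption: mirrors tcp_left_out() = sacked_out + lost_out.
        return self.sacked_out + self.lost_out

    def in_flight(self):
        # Formula as quoted in the thread.
        return self.packets_out - self.left_out() + self.retrans_out

    def consistent(self):
        # left_out should never exceed packets_out.
        return self.left_out() <= self.packets_out


def purge_with_extra_resets(c):
    """Hypothetical variant of the purge proposed in the thread: also
    zero the counters that feed left_out, plus retrans_out."""
    return replace(c, packets_out=0, sacked_out=0, lost_out=0, retrans_out=0)


# Values reported in the ramdump.
ramdump = TcpCounters(packets_out=0, sacked_out=0, lost_out=1, retrans_out=1)

if __name__ == "__main__":
    print("left_out  =", ramdump.left_out())
    print("in_flight =", ramdump.in_flight())
    print("consistent:", ramdump.consistent())
    print("after hypothetical purge:", purge_with_extra_resets(ramdump))
```

With the reported values, in_flight comes out as 0, but left_out (1) exceeds packets_out (0), so the counters are mutually inconsistent. Zeroing the remaining counters would restore consistency in this model; whether that is the right fix, or only hides the cause, is the open question in the thread.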
