Message-ID: <CANn89iJJkpSPMeK7PFH6Hrs=0Hw3Np1haR-+6GOhPwmvsq9x5Q@mail.gmail.com>
Date: Wed, 27 Aug 2025 20:15:59 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: "Ahmed, Shehab Sarar" <shehaba2@...inois.edu>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>, "ncardwell@...gle.com" <ncardwell@...gle.com>,
"kuniyu@...gle.com" <kuniyu@...gle.com>
Subject: Re: [BUG] TCP: Duplicate ACK storm after reordering with delayed
packet (BBR RTO triggered)
On Wed, Aug 27, 2025 at 6:12 PM Ahmed, Shehab Sarar
<shehaba2@...inois.edu> wrote:
>
> Hello,
>
> I am a PhD student doing research on adversarial testing of different TCP protocols. Recently, I found an interesting TCP behavior, which I describe below:
>
> The network RTT was high for about a second before it dropped abruptly. Some packets sent during the high-RTT phase experienced long delays in reaching the destination, while later packets, benefiting from the lower RTT, arrived earlier. This out-of-order arrival caused the receiver to generate duplicate acknowledgments (dup ACKs). Because of the low RTT, these dup ACKs quickly reached the sender. Upon receiving three dup ACKs, the sender initiated a fast retransmission of an earlier packet that was not lost but was simply taking longer to arrive. Interestingly, although the fast-retransmitted packet experienced a lower RTT, the original delayed packet still arrived first. When the receiver received this packet, it sent an ACK for the next packet in sequence. However, when it later received the fast-retransmitted packet, an issue arose in its logic for updating the acknowledgment number. As a result, even after the next expected packet was received, the acknowledgment number was not updated correctly. The receiver continued sending dup ACKs, ultimately forcing the congestion control protocol into the retransmission timeout (RTO) phase.
>
> I observed this behavior in Linux kernel version 5.4.230 and was wondering whether the same issue persists in the most recent kernel. Do you know of any commit that addressed this issue? If not, I am eager to investigate further. My suspicion is that the problem lies in tcp_input.c. I look forward to your reply.
I really wonder why anyone would do any research on v5.4.230, a kernel
more than 2 years old and clearly unsupported.
I suggest you write a packetdrill test to exhibit the issue, then run
a reverse bisection to find the commit fixing it (assuming recent
kernels are fixed).
There are about 8200 patches between v5.4.230 and v5.4.296, a
bisection should be fast.
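
To illustrate why a bisection over roughly 8200 patches is fast, here is a minimal sketch of a reverse bisection (searching for the first fixed commit rather than the first broken one). Everything in it is an assumption for illustration: the function name first_fixed_commit, the is_fixed callback (which stands in for "build this kernel and run the packetdrill test"), and the position of the fix (FIX_AT) are hypothetical. It also assumes the oldest commit is broken, the newest is fixed, and the fix is never reverted in between.

```python
from typing import Callable, Tuple


def first_fixed_commit(n_commits: int,
                       is_fixed: Callable[[int], bool]) -> Tuple[int, int]:
    """Return (index of the first fixed commit, number of kernels tested).

    Assumptions (hypothetical model, not a real git interface):
      - commit 0 is known to be broken,
      - commit n_commits - 1 is known to be fixed,
      - once a commit is fixed, every later commit is fixed too.
    """
    lo, hi = 0, n_commits - 1  # lo: known broken, hi: known fixed
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1             # one kernel build + one test run
        if is_fixed(mid):
            hi = mid           # the fix is at mid or earlier
        else:
            lo = mid           # the fix is after mid
    return hi, tests


# Hypothetical numbers: about 8200 patches, fix landing at index 5000.
N_COMMITS = 8200
FIX_AT = 5000

idx, tests = first_fixed_commit(N_COMMITS, lambda i: i >= FIX_AT)
print(f"first fixed commit index: {idx}, kernels tested: {tests}")
```

Each iteration halves the remaining range, so the number of kernels to build and test grows only logarithmically with the number of patches. With git itself, the same idea can be expressed with custom terms, for example `git bisect start --term-old=broken --term-new=fixed`, so that the "new" state is the fixed one.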
>
> Thanks
> Shehab