Message-ID: <CAAup_245hpq0Z52Q-N7ecoen=x9yCSDBMHiRdk_uSf5V4t0qnA@mail.gmail.com>
Date: Mon, 10 Mar 2025 17:13:16 +0100
From: Marko Pacaric <mpacaric2@...il.com>
To: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Issue with Missing Final ACK in TCP Connections to gracefully close the TCP connection – Regression in 5.15.123 (15251e78)?
Dear Linux Community,
I’m currently investigating a complex issue related to TCP connection
termination and would appreciate your insights to help clarify the
situation.
Background
We have a complex network setup involving multiple hops and a special
implementation on the mobile provider side.
Given this setup, we rely heavily on TCP connections being closed
gracefully, following the standard FIN → FIN-ACK → ACK sequence.
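For reference, the graceful close described above can be sketched from user space on the loopback interface. This is only an illustration, not our production code: the function name `graceful_close_demo` and the `ping` payload are made up for the example. Note that the application only triggers the FINs; the ACKs, including the final one, are generated by the kernel and can only be observed in a packet capture.

```python
import socket


def graceful_close_demo():
    """Open a loopback TCP connection and close it gracefully from both ends."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname(), timeout=5)
    server, _ = listener.accept()
    server.settimeout(5)

    client.sendall(b"ping")
    client.shutdown(socket.SHUT_WR)  # client side sends its FIN

    # Server reads until EOF, i.e. until the client's FIN has arrived.
    data = b""
    while True:
        chunk = server.recv(1024)
        if not chunk:
            break
        data += chunk

    server.close()  # server side sends its FIN
    eof = client.recv(1024)  # returns b"" once the server's FIN has arrived
    client.close()  # the final ACK was already sent by the client's kernel
    listener.close()
    return data, eof


print(graceful_close_demo())
```

Running a packet capture on the loopback interface while this executes shows the full close sequence on a kernel that behaves correctly.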
The Problem
For the past few months, we have been struggling with an issue where
the final ACK from our client is not being sent, leaving the
connection stuck in CLOSE_WAIT or LAST_ACK state on the receiver's
side. This behavior is causing significant issues in our system.
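To quantify stuck connections on a Linux host, the state column of /proc/net/tcp can be tallied. The following is a minimal sketch, assuming the usual /proc/net/tcp layout (one header line, state as the fourth whitespace-separated field, in hex); the function name `count_tcp_states` and the sample rows are made up for illustration.

```python
from collections import Counter

# State codes as defined in the kernel's include/net/tcp_states.h
TCP_STATES = {
    0x01: "ESTABLISHED", 0x02: "SYN_SENT", 0x03: "SYN_RECV",
    0x04: "FIN_WAIT1", 0x05: "FIN_WAIT2", 0x06: "TIME_WAIT",
    0x07: "CLOSE", 0x08: "CLOSE_WAIT", 0x09: "LAST_ACK",
    0x0A: "LISTEN", 0x0B: "CLOSING", 0x0C: "NEW_SYN_RECV",
}


def count_tcp_states(proc_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        code = int(fields[3], 16)  # the "st" column is hexadecimal
        counts[TCP_STATES.get(code, "UNKNOWN_%02X" % code)] += 1
    return counts


# Made-up sample rows, shortened to the fields the parser needs.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue
   0: 0100007F:1F90 0100007F:C350 08 00000000:00000000
   1: 0100007F:C350 0100007F:1F90 05 00000000:00000000
"""

print(count_tcp_states(SAMPLE))
```

On a live system the same function can be fed the contents of /proc/net/tcp (and /proc/net/tcp6 for IPv6 sockets) instead of the sample text.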
What We've Tried
1. Application-Level Investigation
Initially, we suspected an issue in our own implementation.
We rebuilt several applications, but the final ACK was consistently missing.
We even tested with different programming languages and libraries, with the same result.
2. Kernel & TCP Stack Configuration
We modified various TCP stack parameters, but none of the changes
resolved the problem.
Findings
To isolate the issue, we started testing with different Linux kernel versions:
Our currently used version, 5.15.123
(KERNEL.PLATFORM.2.0.r10-05800-kernel.0) → Final ACK is missing,
connections remain in FIN_WAIT_2.
5.15.104 (KERNEL.PLATFORM.2.0.r10-04600-kernel.0) → Final ACK is sent,
connections close correctly.
Through systematic downgrading, we identified 5.15.104 as the last
version in which the issue does not occur. We then analyzed the
commits between these versions and found that the issue seems to have
been introduced by the following commit:
Commit 15251e783a4b
https://git.codelinaro.org/clo/la/kernel/msm-5.15/-/commit/15251e783a4b
What We Did Next
We built a small patch that reverts the commit, so that we could
check whether the faulty behavior disappears.
After applying the following patch, we no longer observe the missing ACKs:
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 420d3bdeaa1b..0580d8719f37 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -923,7 +923,6 @@ static void tcp_v4_send_ack(const struct sock *sk,
 			      &arg, arg.iov[0].iov_len,
 			      transmit_time);
 
-	sock_net_set(ctl_sk, &init_net);
 	__TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
 	local_bh_enable();
 }
Next Steps & Questions
1. Could this commit be responsible for suppressing the final
ACK in certain network conditions?
2. Has anyone else observed similar behavior in recent kernel versions?
3. Are there any known workarounds or patches addressing this issue?
Any insights or suggestions on how to proceed would be greatly appreciated!
BUG - 219849
Thank you very much,
Marko