Message-ID: <CAAup_245hpq0Z52Q-N7ecoen=x9yCSDBMHiRdk_uSf5V4t0qnA@mail.gmail.com>
Date: Mon, 10 Mar 2025 17:13:16 +0100
From: Marko Pacaric <mpacaric2@...il.com>
To: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Issue with Missing Final ACK in TCP Connections to gracefully close the TCP connection – Regression in 5.15.123 (15251e78)?

Dear Linux Community,

I’m currently investigating a complex issue related to TCP connection
termination and would appreciate your insights to help clarify the
situation.

Background

We have a complex network setup involving multiple hops and a special
implementation on the mobile provider side.

Given this setup, we rely heavily on TCP connections being closed
gracefully, following the standard FIN → FIN-ACK → ACK sequence.
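
The close sequence can be exercised on loopback with a minimal Python
sketch. It is an illustration only, not the application from this
report: the function name run_graceful_close and the loopback setup are
assumptions. The client closes first (active close), so the client is
the side that has to send the final ACK.

```python
import socket
import threading


def run_graceful_close(payload: bytes = b"ping") -> bytes:
    """Client sends payload and closes first; server reads to EOF, then closes.

    Returns the bytes the server received (empty if the server timed out).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    received = []

    def server() -> None:
        conn, _ = listener.accept()
        with conn:
            chunks = []
            while True:
                data = conn.recv(4096)
                if not data:  # empty read: the client's FIN has arrived
                    break
                chunks.append(data)
            received.append(b"".join(chunks))
        # Leaving the "with" block closes conn, so the server sends its FIN.

    thread = threading.Thread(target=server)
    thread.start()

    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(payload)
    client.close()  # active close: the client sends the first FIN

    thread.join(timeout=5)
    listener.close()
    return received[0] if received else b""
```

Running a packet capture on the loopback interface alongside this
sketch should show whether the client's final ACK appears on the wire.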

The Problem

For the past few months, we have been struggling with an issue where
the final ACK from our client is not being sent, leaving the
connection stuck in CLOSE_WAIT or LAST_ACK state on the receiver's
side. This behavior is causing significant issues in our system.
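
A minimal sketch for counting sockets per TCP state on a Linux host,
assuming the usual /proc/net/tcp layout in which the fourth column is
the state as a hex code. The function name count_tcp_states is
hypothetical; the state codes are the standard kernel values.

```python
from collections import Counter

# Hex state codes as printed in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text: str) -> Counter:
    """Count sockets per state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        counts[TCP_STATES.get(fields[3].upper(), "UNKNOWN")] += 1
    return counts
```

On the affected side this could be called as
count_tcp_states(open("/proc/net/tcp").read()) before and after a test
run to see whether LAST_ACK, CLOSE_WAIT, or FIN_WAIT2 entries pile up.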

What We've Tried

1. Application-Level Investigation

   Initially, we suspected an issue in our implementation. We rebuilt
   several applications, but the final ACK was consistently missing. We
   even tested with different programming languages and libraries, with
   the same result.

2. Kernel & TCP Stack Configuration

   We modified various TCP stack parameters, but none of the changes
   resolved the problem.
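
For reference when comparing systems, a minimal sketch that records TCP
sysctl values by reading them from /proc/sys. Which parameters were
actually changed is not stated here, so the names in the usage note
are assumptions chosen only because they relate to connection
teardown; read_sysctls is a hypothetical helper.

```python
from pathlib import Path
from typing import Dict, Iterable, Optional


def read_sysctls(names: Iterable[str], root: str = "/proc/sys") -> Dict[str, Optional[str]]:
    """Read dotted sysctl names (e.g. net.ipv4.tcp_fin_timeout) under root.

    A value of None means the entry could not be read on this system.
    """
    values: Dict[str, Optional[str]] = {}
    for name in names:
        path = Path(root) / name.replace(".", "/")
        try:
            values[name] = path.read_text().strip()
        except OSError:
            values[name] = None
    return values
```

For example, read_sysctls(["net.ipv4.tcp_fin_timeout",
"net.ipv4.tcp_orphan_retries", "net.ipv4.tcp_tw_reuse"]) returns a
dictionary that can be attached to a report for each tested kernel.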

Findings

To isolate the issue, we started testing with different Linux kernel versions:

Our currently used version, 5.15.123
(KERNEL.PLATFORM.2.0.r10-05800-kernel.0) → Final ACK is missing,
connections remain in FIN_WAIT_2.

5.15.104 (KERNEL.PLATFORM.2.0.r10-04600-kernel.0) → Final ACK is sent,
connections close correctly.

Through systematic downgrading, we identified that 5.15.104 is the
last version where the issue does not occur. We then analyzed the
commits between these versions and found that the issue seems to have
been introduced by the following commit:

Commit 15251e783a4b
https://git.codelinaro.org/clo/la/kernel/msm-5.15/-/commit/15251e783a4b
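
The downgrading step amounts to a bisection over an ordered list of
builds. A minimal sketch, assuming the builds are ordered oldest to
newest, the first is known good, the last is known bad, and the
problem persists once introduced. The function first_bad and the
is_bad callback (for example, "flash the build, run the close test,
check the capture") are hypothetical.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad(builds: Sequence[T], is_bad: Callable[[T], bool]) -> T:
    """Return the earliest build for which is_bad(build) is true.

    Assumes builds[0] is good, builds[-1] is bad, and every build after
    the first bad one is also bad.
    """
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] good, builds[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid
        else:
            lo = mid
    return builds[hi]
```

Each call to is_bad costs one full test cycle, so the number of builds
to test grows only logarithmically with the size of the range.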

What We Did Next

We built a small patch that reverts the commit so that we could see
whether the faulty behavior goes away. After applying the following
patch, we no longer observe the missing ACKs:
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 420d3bdeaa1b..0580d8719f37 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -923,7 +923,6 @@ static void tcp_v4_send_ack(const struct sock *sk,
                               &arg, arg.iov[0].iov_len,
                               transmit_time);
 
-       sock_net_set(ctl_sk, &init_net);
         __TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
         local_bh_enable();
 }

Next Steps & Questions

1. Could this commit be responsible for suppressing the final ACK in
   certain network conditions?

2. Has anyone else observed similar behavior in recent kernel versions?

3. Are there any known workarounds or patches addressing this issue?

Any insights or suggestions on how to proceed would be greatly appreciated!

BUG - 219849

Thank you very much,

Marko
