Message-ID: <CAFWXMN3WexdQEsjZRGumLuyA8phXmSte_j7JTAjL_v11ZxAmtg@mail.gmail.com>
Date: Wed, 21 Jan 2026 02:03:39 +0530
From: Gopal Malaviya <gopalmalaviya53@...il.com>
To: netdev@...r.kernel.org
Subject: [RFC] net: ipv4: optional early cleanup of half-closed TCP sockets
Hi,

Background:
I am looking into cases where TCP sockets in half-closed states
(CLOSE_WAIT after receipt of the peer's FIN, or FIN_WAIT2 after the
local FIN has been acknowledged) remain available long enough to be
reused by userland connection
pools. In some HTTP client workloads, especially those involving
frequent requests with large request bodies, reuse of such sockets
can lead to follow-up failures such as timeouts or premature close
events on subsequent operations.
This behavior is compliant with TCP semantics, but application-level
connection pools may incorrectly assume that a socket is still usable
as long as it has not been explicitly closed.
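To illustrate the pool-side assumption, here is a minimal userland sketch of a check a connection pool could run before handing out a socket. It is not part of the proposed kernel change. The helper name `peer_has_closed` is hypothetical, and the check relies only on the standard behaviour that a non-blocking `recv()` with `MSG_PEEK` returns an empty result once the peer's FIN has been received and no unread data remains.

```python
import socket


def peer_has_closed(sock: socket.socket) -> bool:
    """Return True if the peer has closed its send side (hypothetical helper).

    Peeks one byte without blocking and without consuming data.
    An empty result means end-of-stream, i.e. the FIN has been received.
    """
    try:
        data = sock.recv(1, socket.MSG_PEEK | socket.MSG_DONTWAIT)
    except BlockingIOError:
        # Nothing to read and no FIN seen: the socket still looks usable.
        return False
    except (ConnectionResetError, BrokenPipeError):
        # The connection was reset; it must not be reused either.
        return True
    # Pending unread data (non-empty result) hides a later FIN, so this
    # only reports a close once the receive queue has been drained.
    return data == b""
```

A pool would call `peer_has_closed(sock)` right before reuse and discard the socket when it returns True. The check is inherently racy: a FIN can still arrive between the check and the next write.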
Problem:
When a remote peer closes its send side early, the local socket
enters a half-closed state as described in RFC 793, RFC 1122, and
RFC 9293. These states are correct and expected. However, sockets
in CLOSE_WAIT or FIN_WAIT2 may persist long enough to be returned
to userland pools, even though a complete request/response exchange
is no longer possible on them.
For workloads that rely heavily on persistent connection reuse,
this can cause intermittent and difficult-to-diagnose failures.
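The CLOSE_WAIT case can be observed from userland on Linux. The sketch below is only an illustration and assumes a Linux host with loopback networking: it reads the first byte of `struct tcp_info` (`tcpi_state`) through the `TCP_INFO` socket option. The numeric state values are copied from the kernel's TCP state enum; the helper names are hypothetical.

```python
import select
import socket

# Values from the kernel's TCP state enum (include/net/tcp_states.h).
TCP_ESTABLISHED = 1
TCP_CLOSE_WAIT = 8


def tcp_state(sock: socket.socket) -> int:
    """Return tcpi_state, the first byte of struct tcp_info (Linux only)."""
    info = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 8)
    return info[0]


def observe_close_wait():
    """Return the accepted socket's state before and after the peer's FIN."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    try:
        before = tcp_state(server)
        # The peer closes only its send side; this transmits a FIN.
        client.shutdown(socket.SHUT_WR)
        # Wait until the FIN is visible as end-of-stream on the server side.
        select.select([server], [], [], 2.0)
        after = tcp_state(server)
        return before, after
    finally:
        server.close()
        client.close()
        listener.close()
```

The accepted socket stays in CLOSE_WAIT until the application closes it, which is the window in which a pool could hand it out again.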
Proposal:
Introduce an optional sysctl:

    net.ipv4.tcp_aggressive_halfclose = 0 (default)

When enabled:
- Upon entering CLOSE_WAIT (peer's FIN received) or FIN_WAIT2 (local
FIN acknowledged), the socket is marked as a candidate for early
teardown.
- After a short configurable grace period (seconds or keepalive
probes), if the socket remains half-closed, the kernel performs
a normal teardown using existing mechanisms (e.g. tcp_done()).
- Sockets handled in this mode would also avoid TIME_WAIT reuse,
ensuring they are not inadvertently returned to userland.
A secondary sysctl could control the grace interval, for example:

    net.ipv4.tcp_aggressive_halfclose_grace = <seconds>

Default TCP behavior remains unchanged unless explicitly enabled.
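For tooling that wants to detect whether the knobs exist, a small sketch follows. It assumes only the usual mapping of a sysctl name onto a path under /proc/sys. Both sysctl names are the ones proposed above and do not exist in current kernels, so on today's systems the helper returns the supplied default. `read_sysctl_int` is a hypothetical helper name.

```python
from pathlib import Path


def read_sysctl_int(name: str, default: int, root: str = "/proc/sys") -> int:
    """Read an integer sysctl, falling back to `default` if absent or unreadable.

    "net.ipv4.foo" is looked up as <root>/net/ipv4/foo.
    """
    path = Path(root).joinpath(*name.split("."))
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Missing file, permission problem, or non-integer content.
        return default
```

For example, `read_sysctl_int("net.ipv4.tcp_aggressive_halfclose", 0)` yields 0 on a kernel without the proposed sysctl, matching the proposed default.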
Rationale:
The intent is to provide an opt-in mechanism for environments where
reuse of half-closed sockets interacts poorly with application-managed
connection pools. The proposal does not modify semantics for established
connections, connection setup, or orderly close initiated locally.
RFC 793, RFC 1122, and RFC 9293 define the TCP state machine and
half-close behavior but allow implementations flexibility in resource
management and socket lifetime. This proposal aims to use that
flexibility in a narrowly scoped and optional manner.
Implementation notes (initial thoughts):
- Tag sockets when they enter CLOSE_WAIT or FIN_WAIT2.
- Apply a short timer or probe-based grace period.
- On expiry, perform standard teardown.
- Avoid TIME_WAIT reuse for sockets marked for aggressive half-close.
- Keep all behavior gated behind sysctl(s).
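The intended semantics of the steps above can be modelled in a few lines of userland code. This is a toy simulation, not kernel code: the class and method names are hypothetical, time is passed in explicitly instead of using kernel timers, and the 5-second grace value is an arbitrary placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class HalfCloseReaper:
    """Toy model of the proposed opt-in teardown of half-closed sockets."""

    enabled: bool = False  # models net.ipv4.tcp_aggressive_halfclose
    grace: float = 5.0     # models the proposed grace sysctl (arbitrary value)
    deadlines: dict = field(default_factory=dict)

    def on_half_close(self, sock_id, now: float) -> None:
        """Tag a socket when it enters a half-closed state."""
        if self.enabled:
            # Keep the first deadline if the socket is tagged twice.
            self.deadlines.setdefault(sock_id, now + self.grace)

    def on_full_close(self, sock_id) -> None:
        """The application closed the socket itself; nothing left to reap."""
        self.deadlines.pop(sock_id, None)

    def expire(self, now: float) -> list:
        """Return, and forget, the sockets whose grace period has elapsed."""
        due = sorted(s for s, d in self.deadlines.items() if d <= now)
        for s in due:
            del self.deadlines[s]
        return due
```

With the flag left at its default of `False`, `on_half_close()` records nothing and `expire()` never returns a socket, which mirrors the requirement that default behavior stays unchanged.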
Request for feedback:
Before preparing a full patch series, I would appreciate feedback on:
- Whether the general idea is acceptable as an opt-in extension.
- Preferred naming and placement of the sysctl(s).
- Whether a grace period is preferred over immediate teardown.
- Any interactions with existing timers or state transitions
that should be considered.
- Any related prior discussions worth reviewing.
Thanks for your time and guidance.

Gopal Malaviya