linux-kernel - [PATCH 2/2] tcp: fix forever orphan socket caused by tcp

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20250314092446.852230-2-youngmin.nam@samsung.com>
Date: Fri, 14 Mar 2025 18:24:46 +0900
From: Youngmin Nam <youngmin.nam@...sung.com>
To: stable@...r.kernel.org
Cc: ncardwell@...gle.com, edumazet@...gle.com, kuba@...nel.org,
	davem@...emloft.net, dsahern@...nel.org, pabeni@...hat.com,
	horms@...nel.org, guo88.liu@...sung.com, yiwang.cai@...sung.com,
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
	joonki.min@...sung.com, hajun.sung@...sung.com, d7271.choe@...sung.com,
	sw.ju@...sung.com, dujeong.lee@...sung.com, ycheng@...gle.com,
	yyd@...gle.com, kuro@...oa.me, youngmin.nam@...sung.com,
	cmllamas@...gle.com, willdeacon@...gle.com, maennich@...gle.com,
	gregkh@...gle.com, Lorenzo Colitti <lorenzo@...gle.com>, Jason Xing
	<kerneljasonxing@...il.com>
Subject: [PATCH 2/2] tcp: fix forever orphan socket caused by tcp_abort

From: Xueming Feng <kuro@...oa.me>

commit bac76cf89816bff06c4ec2f3df97dc34e150a1c4 upstream.

We have some problem closing zero-window fin-wait-1 tcp sockets in our
environment. This patch come from the investigation.

Previously tcp_abort only sends out reset and calls tcp_done when the
socket is not SOCK_DEAD, aka orphan. For orphan socket, it will only
purging the write queue, but not close the socket and left it to the
timer.

While purging the write queue, tp->packets_out and sk->sk_write_queue
is cleared along the way. However tcp_retransmit_timer have early
return based on !tp->packets_out and tcp_probe_timer have early
return based on !sk->sk_write_queue.

This caused ICSK_TIME_RETRANS and ICSK_TIME_PROBE0 not being resched
and socket not being killed by the timers, converting a zero-windowed
orphan into a forever orphan.

This patch removes the SOCK_DEAD check in tcp_abort, making it send
reset to peer and close the socket accordingly. Preventing the
timer-less orphan from happening.

According to Lorenzo's email in the v1 thread, the check was there to
prevent force-closing the same socket twice. That situation is handled
by testing for TCP_CLOSE inside lock, and returning -ENOENT if it is
already closed.

The -ENOENT code comes from the associate patch Lorenzo made for
iproute2-ss; link attached below, which also conform to RFC 9293.

At the end of the patch, tcp_write_queue_purge(sk) is removed because it
was already called in tcp_done_with_error().

p.s. This is the same patch with v2. Resent due to mis-labeled "changes
requested" on patchwork.kernel.org.

Link: https://patchwork.ozlabs.org/project/netdev/patch/1450773094-7978-3-git-send-email-lorenzo@google.com/
Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.")
Signed-off-by: Xueming Feng <kuro@...oa.me>
Tested-by: Lorenzo Colitti <lorenzo@...gle.com>
Reviewed-by: Jason Xing <kerneljasonxing@...il.com>
Reviewed-by: Eric Dumazet <edumazet@...gle.com>
Link: https://patch.msgid.link/20240826102327.1461482-1-kuro@kuroa.me
Signed-off-by: Jakub Kicinski <kuba@...nel.org>
Cc: <stable@...r.kernel.org> # v5.10+
Link: https://lore.kernel.org/lkml/Z9OZS%2Fhc+v5og6%2FU@perf/
[youngmin: Resolved minor conflict in net/ipv4/tcp.c]
Signed-off-by: Youngmin Nam <youngmin.nam@...sung.com>
---
 net/ipv4/tcp.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9fe164aa185c..ff22060f9145 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4620,6 +4620,13 @@ int tcp_abort(struct sock *sk, int err)
 		/* Don't race with userspace socket closes such as tcp_close. */
 		lock_sock(sk);

+	/* Avoid closing the same socket twice. */
+	if (sk->sk_state == TCP_CLOSE) {
+		if (!has_current_bpf_ctx())
+			release_sock(sk);
+		return -ENOENT;
+	}
+
 	if (sk->sk_state == TCP_LISTEN) {
 		tcp_set_state(sk, TCP_CLOSE);
 		inet_csk_listen_stop(sk);
@@ -4629,15 +4636,12 @@ int tcp_abort(struct sock *sk, int err)
 	local_bh_disable();
 	bh_lock_sock(sk);

-	if (!sock_flag(sk, SOCK_DEAD)) {
-		if (tcp_need_reset(sk->sk_state))
-			tcp_send_active_reset(sk, GFP_ATOMIC);
-		tcp_done_with_error(sk, err);
-	}
+	if (tcp_need_reset(sk->sk_state))
+		tcp_send_active_reset(sk, GFP_ATOMIC);
+	tcp_done_with_error(sk, err);

 	bh_unlock_sock(sk);
 	local_bh_enable();
-	tcp_write_queue_purge(sk);
 	if (!has_current_bpf_ctx())
 		release_sock(sk);
 	return 0;
-- 
2.39.2