lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20240801111611.84743-1-kuro@kuroa.me>
Date: Thu,  1 Aug 2024 19:16:11 +0800
From: Xueming Feng <kuro@...oa.me>
To: "David S . Miller" <davem@...emloft.net>,
	netdev@...r.kernel.org
Cc: Eric Dumazet <edumazet@...gle.com>,
	Neal Cardwell <ncardwell@...gle.com>,
	Yuchung Cheng <ycheng@...gle.com>,
	Soheil Hassas Yeganeh <soheil@...gle.com>,
	David Ahern <dsahern@...nel.org>,
	linux-kernel@...r.kernel.org,
	Xueming Feng <kuro@...oa.me>
Subject: [PATCH net] tcp: fix forever orphan socket caused by tcp_abort

We have some problem closing zero-window fin-wait-1 tcp sockets in our 
environment. This patch come from the investigation.

Previously tcp_abort only sends out reset and calls tcp_done when the 
socket is not SOCK_DEAD aka. orphan. For orphan socket, it will only 
purging the write queue, but not close the socket and left it to the 
timer.

While purging the write queue, tp->packets_out and sk->sk_write_queue 
is cleared along the way. However tcp_retransmit_timer have early 
return based on !tp->packets_out and tcp_probe_timer have early 
return based on !sk->sk_write_queue.

This caused ICSK_TIME_RETRANS and ICSK_TIME_PROBE0 not being resched 
and socket not being killed by the timers. Converting a zero-windowed
orphan to a forever orphan.

This patch removes the SOCK_DEAD check in tcp_abort, making it send 
reset to peer and close the socket accordingly. Preventing the 
timer-less orphan from happening.

Fixes: e05836ac07c7 ("tcp: purge write queue upon aborting the connection")
Fixes: bffd168c3fc5 ("tcp: clear tp->packets_out when purging write queue")
Signed-off-by: Xueming Feng <kuro@...oa.me>
---
 net/ipv4/tcp.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index e03a342c9162..65e8d28d15b1 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4646,12 +4646,10 @@ int tcp_abort(struct sock *sk, int err)
 	local_bh_disable();
 	bh_lock_sock(sk);
 
-	if (!sock_flag(sk, SOCK_DEAD)) {
-		if (tcp_need_reset(sk->sk_state))
-			tcp_send_active_reset(sk, GFP_ATOMIC,
-					      SK_RST_REASON_NOT_SPECIFIED);
-		tcp_done_with_error(sk, err);
-	}
+	if (tcp_need_reset(sk->sk_state))
+		tcp_send_active_reset(sk, GFP_ATOMIC,
+					SK_RST_REASON_NOT_SPECIFIED);
+	tcp_done_with_error(sk, err);
 
 	bh_unlock_sock(sk);
 	local_bh_enable();
-- 
2.39.2


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ