Message-ID: <3e080bdba9981988ff86e120df40a5f0dc6cd033.1752859383.git.pabeni@redhat.com>
Date: Fri, 18 Jul 2025 19:25:12 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: netdev@...r.kernel.org
Cc: Eric Dumazet <edumazet@...gle.com>,
Neal Cardwell <ncardwell@...gle.com>,
Kuniyuki Iwashima <kuniyu@...gle.com>,
"David S. Miller" <davem@...emloft.net>,
David Ahern <dsahern@...nel.org>,
Jakub Kicinski <kuba@...nel.org>,
Simon Horman <horms@...nel.org>,
Matthieu Baerts <matttbe@...nel.org>
Subject: [PATCH net-next 1/2] tcp: do not set a zero size receive buffer
The nipa CI is reporting frequent failures in the mptcp_connect
self-tests.
In the failing scenarios (TCP -> MPTCP) the sockets involved are
actually plain TCP ones, as fallback for the passive socket at 2whs
time causes the MPTCP listener to actually create a TCP socket.
The transfer is stuck because the receive buffer size is zero: with
the stronger check in place, tcp_clamp_window() can be invoked while
the TCP socket has sk_rmem_alloc == 0, and in that case the receive
buffer is zeroed, too.
Also pass the current skb truesize to tcp_clamp_window(), so that the
helper can compute and use the actual limit enforced by the stack.
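Illustrative only, not part of the patch: a small userspace C sketch of
the arithmetic with made-up values (rmem2, truesize), showing how the
old min() collapses the receive buffer to zero when sk_rmem_alloc is 0,
while the new computation keeps room for the skb being processed.

	#include <stdio.h>

	static int min_int(int a, int b) { return a < b ? a : b; }

	int main(void)
	{
		int rmem2 = 6291456;   /* hypothetical tcp_rmem[2] limit */
		int rmem_alloc = 0;    /* nothing queued yet on the fallback socket */
		int truesize = 2304;   /* hypothetical truesize of the incoming skb */

		/* old: min(sk_rmem_alloc, rmem2) == 0 -> receive buffer zeroed */
		printf("old rcvbuf: %d\n", min_int(rmem_alloc, rmem2));

		/* new: also account for the skb being processed */
		printf("new rcvbuf: %d\n", min_int(rmem_alloc + truesize, rmem2));
		return 0;
	}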
Fixes: 1d2fbaad7cd8 ("tcp: stronger sk_rcvbuf checks")
Signed-off-by: Paolo Abeni <pabeni@...hat.com>
---
net/ipv4/tcp_input.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 672cbfbdcec1..c98de02a3c57 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -610,24 +610,24 @@ static void tcp_init_buffer_space(struct sock *sk)
}
/* 4. Recalculate window clamp after socket hit its memory bounds. */
-static void tcp_clamp_window(struct sock *sk)
+static void tcp_clamp_window(struct sock *sk, int truesize)
{
struct tcp_sock *tp = tcp_sk(sk);
struct inet_connection_sock *icsk = inet_csk(sk);
struct net *net = sock_net(sk);
- int rmem2;
+ int rmem2, needed;
icsk->icsk_ack.quick = 0;
rmem2 = READ_ONCE(net->ipv4.sysctl_tcp_rmem[2]);
+ needed = atomic_read(&sk->sk_rmem_alloc) + truesize;
if (sk->sk_rcvbuf < rmem2 &&
!(sk->sk_userlocks & SOCK_RCVBUF_LOCK) &&
!tcp_under_memory_pressure(sk) &&
sk_memory_allocated(sk) < sk_prot_mem_limits(sk, 0)) {
- WRITE_ONCE(sk->sk_rcvbuf,
- min(atomic_read(&sk->sk_rmem_alloc), rmem2));
+ WRITE_ONCE(sk->sk_rcvbuf, min(needed, rmem2));
}
- if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
+ if (needed > sk->sk_rcvbuf)
tp->rcv_ssthresh = min(tp->window_clamp, 2U * tp->advmss);
}
@@ -5552,7 +5552,7 @@ static int tcp_prune_queue(struct sock *sk, const struct sk_buff *in_skb)
NET_INC_STATS(sock_net(sk), LINUX_MIB_PRUNECALLED);
if (!tcp_can_ingest(sk, in_skb))
- tcp_clamp_window(sk);
+ tcp_clamp_window(sk, in_skb->truesize);
else if (tcp_under_memory_pressure(sk))
tcp_adjust_rcv_ssthresh(sk);
--
2.50.0