Message-ID: <1a43cc72-126a-41d3-8af9-b1a3a303386a@nvidia.com>
Date: Fri, 24 Oct 2025 17:12:55 +0200
From: Dragos Tatulea <dtatulea@...dia.com>
To: Eric Dumazet <edumazet@...gle.com>, "David S . Miller"
 <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
 Paolo Abeni <pabeni@...hat.com>
Cc: Simon Horman <horms@...nel.org>, Neal Cardwell <ncardwell@...gle.com>,
 Willem de Bruijn <willemb@...gle.com>, Kuniyuki Iwashima
 <kuniyu@...gle.com>, Matthieu Baerts <matttbe@...nel.org>,
 Mat Martineau <martineau@...nel.org>, Geliang Tang <geliang@...nel.org>,
 netdev@...r.kernel.org, eric.dumazet@...il.com
Subject: Re: [PATCH net-next 3/3] tcp: fix too slow tcp_rcvbuf_grow() action

On 24.10.25 09:50, Eric Dumazet wrote:
> While the blamed commits apparently avoided an overshoot,
> they also limited how fast a sender can increase BDP at each RTT.
> 
> This is not exactly a revert, we do not add the 16 * tp->advmss
> cushion we had, and we are keeping the out_of_order_queue
> contribution.
> 
> Do the same in mptcp_rcvbuf_grow().
> 
> Tested:
> 
> emulated 50ms rtt (tcp_stream --tcp-tx-delay 50000), cubic 20 second flow.
> net.ipv4.tcp_rmem set to "4096 131072 67000000"
> 
> perf record -a -e tcp:tcp_rcvbuf_grow sleep 20
> perf script
> 
> Before:
> 
> We can see we fail to roughly double RWIN at each RTT.
> Sender is RWIN limited while CWND is ramping up (before getting tcp_wmem limited)
> 
> tcp_stream 33793 [010]  825.717525: tcp:tcp_rcvbuf_grow: time=100869 rtt_us=50428 copied=49152 inq=0 space=40960 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=103970 window_clamp=112128 rcv_wnd=106496
> tcp_stream 33793 [010]  825.768966: tcp:tcp_rcvbuf_grow: time=51447 rtt_us=50362 copied=86016 inq=0 space=49152 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=107474 window_clamp=112128 rcv_wnd=106496
> tcp_stream 33793 [010]  825.821539: tcp:tcp_rcvbuf_grow: time=52577 rtt_us=50243 copied=114688 inq=0 space=86016 ooo=0 scaling_ratio=219 rcvbuf=201096 rcv_ssthresh=167377 window_clamp=172031 rcv_wnd=167936
> tcp_stream 33793 [010]  825.871781: tcp:tcp_rcvbuf_grow: time=50248 rtt_us=50237 copied=167936 inq=0 space=114688 ooo=0 scaling_ratio=219 rcvbuf=268129 rcv_ssthresh=224722 window_clamp=229375 rcv_wnd=225280
> tcp_stream 33793 [010]  825.922475: tcp:tcp_rcvbuf_grow: time=50698 rtt_us=50183 copied=241664 inq=0 space=167936 ooo=0 scaling_ratio=219 rcvbuf=392617 rcv_ssthresh=331217 window_clamp=335871 rcv_wnd=323584
> tcp_stream 33793 [010]  825.973326: tcp:tcp_rcvbuf_grow: time=50855 rtt_us=50213 copied=339968 inq=0 space=241664 ooo=0 scaling_ratio=219 rcvbuf=564986 rcv_ssthresh=478674 window_clamp=483327 rcv_wnd=462848
> tcp_stream 33793 [010]  826.023970: tcp:tcp_rcvbuf_grow: time=50647 rtt_us=50248 copied=491520 inq=0 space=339968 ooo=0 scaling_ratio=219 rcvbuf=794811 rcv_ssthresh=671778 window_clamp=679935 rcv_wnd=651264
> tcp_stream 33793 [010]  826.074612: tcp:tcp_rcvbuf_grow: time=50648 rtt_us=50227 copied=700416 inq=0 space=491520 ooo=0 scaling_ratio=219 rcvbuf=1149124 rcv_ssthresh=974881 window_clamp=983039 rcv_wnd=942080
> tcp_stream 33793 [010]  826.125452: tcp:tcp_rcvbuf_grow: time=50845 rtt_us=50225 copied=987136 inq=8192 space=700416 ooo=0 scaling_ratio=219 rcvbuf=1637502 rcv_ssthresh=1392674 window_clamp=1400831 rcv_wnd=1339392
> tcp_stream 33793 [010]  826.175698: tcp:tcp_rcvbuf_grow: time=50250 rtt_us=50198 copied=1347584 inq=0 space=978944 ooo=0 scaling_ratio=219 rcvbuf=2288672 rcv_ssthresh=1949729 window_clamp=1957887 rcv_wnd=1945600
> tcp_stream 33793 [010]  826.225947: tcp:tcp_rcvbuf_grow: time=50252 rtt_us=50240 copied=1945600 inq=0 space=1347584 ooo=0 scaling_ratio=219 rcvbuf=3150516 rcv_ssthresh=2687010 window_clamp=2695167 rcv_wnd=2691072
> tcp_stream 33793 [010]  826.276175: tcp:tcp_rcvbuf_grow: time=50233 rtt_us=50224 copied=2691072 inq=0 space=1945600 ooo=0 scaling_ratio=219 rcvbuf=4548617 rcv_ssthresh=3883041 window_clamp=3891199 rcv_wnd=3887104
> tcp_stream 33793 [010]  826.326403: tcp:tcp_rcvbuf_grow: time=50233 rtt_us=50229 copied=3887104 inq=0 space=2691072 ooo=0 scaling_ratio=219 rcvbuf=6291456 rcv_ssthresh=5370482 window_clamp=5382144 rcv_wnd=5373952
> tcp_stream 33793 [010]  826.376723: tcp:tcp_rcvbuf_grow: time=50323 rtt_us=50218 copied=5373952 inq=0 space=3887104 ooo=0 scaling_ratio=219 rcvbuf=9087658 rcv_ssthresh=7755537 window_clamp=7774207 rcv_wnd=7757824
> tcp_stream 33793 [010]  826.426991: tcp:tcp_rcvbuf_grow: time=50274 rtt_us=50196 copied=7757824 inq=180224 space=5373952 ooo=0 scaling_ratio=219 rcvbuf=12563759 rcv_ssthresh=10729233 window_clamp=10747903 rcv_wnd=10575872
> tcp_stream 33793 [010]  826.477229: tcp:tcp_rcvbuf_grow: time=50241 rtt_us=50078 copied=10731520 inq=180224 space=7577600 ooo=0 scaling_ratio=219 rcvbuf=17715667 rcv_ssthresh=15136529 window_clamp=15155199 rcv_wnd=14983168
> tcp_stream 33793 [010]  826.527482: tcp:tcp_rcvbuf_grow: time=50258 rtt_us=50153 copied=15138816 inq=360448 space=10551296 ooo=0 scaling_ratio=219 rcvbuf=24667870 rcv_ssthresh=21073410 window_clamp=21102591 rcv_wnd=20766720
> tcp_stream 33793 [010]  826.577712: tcp:tcp_rcvbuf_grow: time=50234 rtt_us=50228 copied=21073920 inq=0 space=14778368 ooo=0 scaling_ratio=219 rcvbuf=34550339 rcv_ssthresh=29517041 window_clamp=29556735 rcv_wnd=29519872
> tcp_stream 33793 [010]  826.627982: tcp:tcp_rcvbuf_grow: time=50275 rtt_us=50220 copied=29519872 inq=540672 space=21073920 ooo=0 scaling_ratio=219 rcvbuf=49268707 rcv_ssthresh=42090625 window_clamp=42147839 rcv_wnd=41627648
> tcp_stream 33793 [010]  826.678274: tcp:tcp_rcvbuf_grow: time=50296 rtt_us=50185 copied=42053632 inq=761856 space=28979200 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57238168 window_clamp=57316406 rcv_wnd=56606720
> tcp_stream 33793 [010]  826.728627: tcp:tcp_rcvbuf_grow: time=50357 rtt_us=50128 copied=43913216 inq=851968 space=41291776 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56524800
> tcp_stream 33793 [010]  827.131364: tcp:tcp_rcvbuf_grow: time=50239 rtt_us=50127 copied=43843584 inq=655360 space=43061248 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56696832
> tcp_stream 33793 [010]  827.181613: tcp:tcp_rcvbuf_grow: time=50254 rtt_us=50115 copied=43843584 inq=524288 space=43188224 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56807424
> tcp_stream 33793 [010]  828.339635: tcp:tcp_rcvbuf_grow: time=50283 rtt_us=50110 copied=43843584 inq=458752 space=43319296 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56864768
> tcp_stream 33793 [010]  828.440350: tcp:tcp_rcvbuf_grow: time=50404 rtt_us=50099 copied=43843584 inq=393216 space=43384832 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56922112
> tcp_stream 33793 [010]  829.195106: tcp:tcp_rcvbuf_grow: time=50154 rtt_us=50077 copied=43843584 inq=196608 space=43450368 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=57090048
> 
> After:
> 
> It takes few steps to increase RWIN. Sender is no longer RWIN limited.
> 
> tcp_stream 50826 [010]  935.634212: tcp:tcp_rcvbuf_grow: time=100788 rtt_us=50315 copied=49152 inq=0 space=40960 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=103970 window_clamp=112128 rcv_wnd=106496
> tcp_stream 50826 [010]  935.685642: tcp:tcp_rcvbuf_grow: time=51437 rtt_us=50361 copied=86016 inq=0 space=49152 ooo=0 scaling_ratio=219 rcvbuf=160875 rcv_ssthresh=132969 window_clamp=137623 rcv_wnd=131072
> tcp_stream 50826 [010]  935.738299: tcp:tcp_rcvbuf_grow: time=52660 rtt_us=50256 copied=139264 inq=0 space=86016 ooo=0 scaling_ratio=219 rcvbuf=502741 rcv_ssthresh=411497 window_clamp=430079 rcv_wnd=413696
> tcp_stream 50826 [010]  935.788544: tcp:tcp_rcvbuf_grow: time=50249 rtt_us=50233 copied=307200 inq=0 space=139264 ooo=0 scaling_ratio=219 rcvbuf=728690 rcv_ssthresh=618717 window_clamp=623371 rcv_wnd=618496
> tcp_stream 50826 [010]  935.838796: tcp:tcp_rcvbuf_grow: time=50258 rtt_us=50202 copied=618496 inq=0 space=307200 ooo=0 scaling_ratio=219 rcvbuf=2450338 rcv_ssthresh=1855709 window_clamp=2096187 rcv_wnd=1859584
> tcp_stream 50826 [010]  935.889140: tcp:tcp_rcvbuf_grow: time=50347 rtt_us=50166 copied=1261568 inq=0 space=618496 ooo=0 scaling_ratio=219 rcvbuf=4376503 rcv_ssthresh=3725291 window_clamp=3743961 rcv_wnd=3706880
> tcp_stream 50826 [010]  935.939435: tcp:tcp_rcvbuf_grow: time=50300 rtt_us=50185 copied=2478080 inq=24576 space=1261568 ooo=0 scaling_ratio=219 rcvbuf=9082648 rcv_ssthresh=7733731 window_clamp=7769921 rcv_wnd=7692288
> tcp_stream 50826 [010]  935.989681: tcp:tcp_rcvbuf_grow: time=50251 rtt_us=50221 copied=4915200 inq=114688 space=2453504 ooo=0 scaling_ratio=219 rcvbuf=16574936 rcv_ssthresh=14108110 window_clamp=14179339 rcv_wnd=14024704
> tcp_stream 50826 [010]  936.039967: tcp:tcp_rcvbuf_grow: time=50289 rtt_us=50279 copied=9830400 inq=114688 space=4800512 ooo=0 scaling_ratio=219 rcvbuf=32695050 rcv_ssthresh=27896187 window_clamp=27969593 rcv_wnd=27815936
> tcp_stream 50826 [010]  936.090172: tcp:tcp_rcvbuf_grow: time=50211 rtt_us=50200 copied=19841024 inq=114688 space=9715712 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57245176 window_clamp=57316406 rcv_wnd=57163776
> tcp_stream 50826 [010]  936.140430: tcp:tcp_rcvbuf_grow: time=50262 rtt_us=50197 copied=39501824 inq=114688 space=19726336 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57245176 window_clamp=57316406 rcv_wnd=57163776
> tcp_stream 50826 [010]  936.190527: tcp:tcp_rcvbuf_grow: time=50101 rtt_us=50071 copied=43655168 inq=262144 space=39387136 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57259192 window_clamp=57316406 rcv_wnd=57032704
> tcp_stream 50826 [010]  936.240719: tcp:tcp_rcvbuf_grow: time=50197 rtt_us=50057 copied=43843584 inq=262144 space=43393024 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57259192 window_clamp=57316406 rcv_wnd=57032704
> tcp_stream 50826 [010]  936.341271: tcp:tcp_rcvbuf_grow: time=50297 rtt_us=50123 copied=43843584 inq=131072 space=43581440 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57259192 window_clamp=57316406 rcv_wnd=57147392
> tcp_stream 50826 [010]  936.642503: tcp:tcp_rcvbuf_grow: time=50131 rtt_us=50084 copied=43843584 inq=0 space=43712512 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57259192 window_clamp=57316406 rcv_wnd=57262080
> 
> Fixes: 65c5287892e9 ("tcp: fix sk_rcvbuf overshoot")
> Fixes: e118cdc34dd1 ("mptcp: rcvbuf auto-tuning improvement")
> Reported-by: Neal Cardwell <ncardwell@...gle.com>
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> ---
>  net/ipv4/tcp_input.c | 8 +++++++-
>  net/mptcp/protocol.c | 7 +++++++
>  2 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index c8cfd700990f28a8bc64e4353a2c78a82bb6bcb2..f004072654a4c50da14b9dafc46133feb71f12cd 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -896,6 +896,7 @@ void tcp_rcvbuf_grow(struct sock *sk, u32 newval)
>  	const struct net *net = sock_net(sk);
>  	struct tcp_sock *tp = tcp_sk(sk);
>  	u32 rcvwin, rcvbuf, cap, oldval;
> +	u64 grow;
>  
>  	oldval = tp->rcvq_space.space;
>  	tp->rcvq_space.space = newval;
> @@ -904,9 +905,14 @@ void tcp_rcvbuf_grow(struct sock *sk, u32 newval)
>  	    (sk->sk_userlocks & SOCK_RCVBUF_LOCK))
>  		return;
>  
> -	/* slow start: allow the sender to double its rate. */
> +	/* DRS is always one RTT late. */
>  	rcvwin = newval << 1;
>  
> +	/* slow start: allow the sender to double its rate. */
> +	grow = (u64)rcvwin * (newval - oldval);
> +	do_div(grow, oldval);
> +	rcvwin += grow << 1;
> +
>  	if (!RB_EMPTY_ROOT(&tp->out_of_order_queue))
>  		rcvwin += TCP_SKB_CB(tp->ooo_last_skb)->end_seq - tp->rcv_nxt;
>  
Hi Eric,

With this series applied, I see a regression in a simple 25G iperf test:
packets are dropped on the server side (out of buffer), which causes
retransmissions.

The test:
- server: iperf3 -s -A 5
- client: iperf3 -c 1.1.1.1 -B 25G
- Configuration:
  - Server has a single queue with affinity set on CPU 5.
  - Ring size: 1K (4K ring size seems ok)
  - MTU: 1500
  - Client uses TSO, server uses SW GRO.

Before series (includes first patch):
<...>-2192  [005]   162.451893: tcp_rcvbuf_grow: time=1622 rtt_us=1596 copied=76781 inq=30408 space=14480 ooo=0 scaling_ratio=188 rcvbuf=131072 rcv_ssthresh=91990 window_clamp=96256 rcv_wnd=66560 
<...>-2192  [005]   162.451998: tcp_rcvbuf_grow: time=106 rtt_us=105 copied=158720 inq=0 space=46373 ooo=0 scaling_ratio=188 rcvbuf=131072 rcv_ssthresh=91990 window_clamp=96256 rcv_wnd=92160 
<...>-2192  [005]   162.453254: tcp_rcvbuf_grow: time=142 rtt_us=44 copied=292496 inq=91512 space=158720 ooo=0 scaling_ratio=188 rcvbuf=432258 rcv_ssthresh=270533 window_clamp=317439 rcv_wnd=253952 
<...>-2192  [005]   162.454446: tcp_rcvbuf_grow: time=113 rtt_us=44 copied=343176 inq=127424 space=200984 ooo=0 scaling_ratio=188 rcvbuf=547360 rcv_ssthresh=349656 window_clamp=401967 rcv_wnd=345088 
<...>-2192  [005]   162.455726: tcp_rcvbuf_grow: time=52 rtt_us=44 copied=264464 inq=40544 space=215752 ooo=0 scaling_ratio=188 rcvbuf=587579 rcv_ssthresh=391036 window_clamp=431503 rcv_wnd=194560 
<...>-2192  [005]   162.456444: tcp_rcvbuf_grow: time=37 rtt_us=36 copied=322560 inq=0 space=223920 ooo=0 scaling_ratio=188 rcvbuf=609824 rcv_ssthresh=391036 window_clamp=447839 rcv_wnd=323584 
<...>-2192  [005]   162.456865: tcp_rcvbuf_grow: time=40 rtt_us=36 copied=421840 inq=73848 space=322560 ooo=0 scaling_ratio=188 rcvbuf=878461 rcv_ssthresh=581105 window_clamp=645119 rcv_wnd=515072 
<...>-2192  [005]   162.457762: tcp_rcvbuf_grow: time=38 rtt_us=36 copied=430176 inq=65160 space=347992 ooo=0 scaling_ratio=188 rcvbuf=947722 rcv_ssthresh=631969 window_clamp=695983 rcv_wnd=467968 
<...>-2192  [005]   162.463191: tcp_rcvbuf_grow: time=35 rtt_us=34 copied=411336 inq=0 space=365016 ooo=0 scaling_ratio=188 rcvbuf=994086 rcv_ssthresh=666017 window_clamp=730031 rcv_wnd=354304 
<...>-2192  [005]   162.469069: tcp_rcvbuf_grow: time=38 rtt_us=34 copied=444520 inq=0 space=411336 ooo=0 scaling_ratio=188 rcvbuf=1120234 rcv_ssthresh=783379 window_clamp=822671 rcv_wnd=679936 

After series:
<...>-2585  [005]  1061.768676: tcp_rcvbuf_grow: time=623 rtt_us=600 copied=72437 inq=28960 space=14480 ooo=0 scaling_ratio=188 rcvbuf=131072 rcv_ssthresh=81968 window_clamp=96256 rcv_wnd=82944 
<...>-2585  [005]  1061.769859: tcp_rcvbuf_grow: time=89 rtt_us=55 copied=250560 inq=46336 space=43477 ooo=0 scaling_ratio=188 rcvbuf=592631 rcv_ssthresh=302062 window_clamp=435213 rcv_wnd=230400 
<...>-2585  [005]  1061.775618: tcp_rcvbuf_grow: time=56 rtt_us=55 copied=405296 inq=140016 space=204224 ooo=0 scaling_ratio=188 rcvbuf=4668930 rcv_ssthresh=1927847 window_clamp=3428745 rcv_wnd=1928192 
<...>-2585  [005]  1061.777324: tcp_rcvbuf_grow: time=57 rtt_us=55 copied=450664 inq=131072 space=265280 ooo=0 scaling_ratio=188 rcvbuf=4668930 rcv_ssthresh=3106743 window_clamp=3428745 rcv_wnd=3006464 
<...>-2585  [005]  1061.783411: tcp_rcvbuf_grow: time=58 rtt_us=55 copied=521280 inq=41160 space=319592 ooo=0 scaling_ratio=188 rcvbuf=4668930 rcv_ssthresh=3364731 window_clamp=3428745 rcv_wnd=2086912 
<...>-2585  [005]  1061.790393: tcp_rcvbuf_grow: time=55 rtt_us=55 copied=524288 inq=0 space=480120 ooo=0 scaling_ratio=188 rcvbuf=4668930 rcv_ssthresh=3364731 window_clamp=3428745 rcv_wnd=2492416 
<...>-2585  [005]  1061.935387: tcp_rcvbuf_grow: time=55 rtt_us=55 copied=537824 inq=0 space=524288 ooo=0 scaling_ratio=188 rcvbuf=4668930 rcv_ssthresh=3364731 window_clamp=3428745 rcv_wnd=2258944 
<...>-2585  [005]  1062.977374: tcp_rcvbuf_grow: time=57 rtt_us=55 copied=545064 inq=0 space=537824 ooo=0 scaling_ratio=188 rcvbuf=4668930 rcv_ssthresh=3428745 window_clamp=3428745 rcv_wnd=2223104 
<...>-2585  [005]  1064.873376: tcp_rcvbuf_grow: time=57 rtt_us=55 copied=549408 inq=0 space=545064 ooo=0 scaling_ratio=188 rcvbuf=4668930 rcv_ssthresh=3428745 window_clamp=3428745 rcv_wnd=2509824 
<...>-2585  [005]  1065.984340: tcp_rcvbuf_grow: time=59 rtt_us=55 copied=574024 inq=0 space=549408 ooo=0 scaling_ratio=188 rcvbuf=4668930 rcv_ssthresh=3428745 window_clamp=3428745 rcv_wnd=2336768 
<...>-2585  [005]  1066.210718: tcp_rcvbuf_grow: time=410 rtt_us=55 copied=589448 inq=0 space=574024 ooo=0 scaling_ratio=188 rcvbuf=4668930 rcv_ssthresh=3428745 window_clamp=3428745 rcv_wnd=3364864 
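
For reference, the rcvbuf values in both traces can be reproduced from the
arithmetic in the quoted patch. This is only a minimal Python sketch, not the
kernel code. It assumes that newval is copied - inq (inferred from each line's
space being equal to the previous line's copied - inq) and that the
window-to-buffer conversion is (win << 8) / scaling_ratio. It ignores the
tcp_rmem[2] cap and the out-of-order contribution (ooo=0 in all lines above).
The function names are hypothetical.

```python
# Assumption: the kernel expresses scaling_ratio as a fraction of 2**8.
TCP_RMEM_TO_WIN_SCALE = 8


def space_from_win(win: int, scaling_ratio: int) -> int:
    """Convert a receive window to buffer space (assumed conversion)."""
    return (win << TCP_RMEM_TO_WIN_SCALE) // scaling_ratio


def rcvbuf_before(oldval: int, newval: int, scaling_ratio: int) -> int:
    """Before the patch: only 'rcvwin = newval << 1' is applied."""
    rcvwin = newval << 1
    return space_from_win(rcvwin, scaling_ratio)


def rcvbuf_after(oldval: int, newval: int, scaling_ratio: int) -> int:
    """With the patch: add twice the proportional growth (newval > oldval)."""
    rcvwin = newval << 1
    grow = rcvwin * (newval - oldval) // oldval  # do_div(grow, oldval)
    rcvwin += grow << 1
    return space_from_win(rcvwin, scaling_ratio)


if __name__ == "__main__":
    # "Before series", 2nd line: space=46373, copied=158720, inq=0
    print(rcvbuf_before(46373, 158720 - 0, 188))     # 432258
    # "After series", 1st line: space=14480, copied=72437, inq=28960
    print(rcvbuf_after(14480, 72437 - 28960, 188))   # 592631
    # "After series", 2nd line: space=43477, copied=250560, inq=46336
    print(rcvbuf_after(43477, 250560 - 46336, 188))  # 4668930
```

Under these assumptions each result matches the rcvbuf reported on the
following trace line: with the series the second step requests about 4.6 MB,
while a similar step before the series requested about 430 KB.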

Is this expected?

Thanks,
Dragos
