netdev - Re: [PATCH net-next 7/8] tcp: stronger sk

Message-ID: <CANn89iJRCW3VNsY3vZwurvh52diE+scUfZvwx5bg5Tuoa3L_TQ@mail.gmail.com>
Date: Thu, 18 Dec 2025 14:19:40 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Christian Ebner <c.ebner@...xmox.com>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, 
	Paolo Abeni <pabeni@...hat.com>, Neal Cardwell <ncardwell@...gle.com>, 
	Simon Horman <horms@...nel.org>, Kuniyuki Iwashima <kuniyu@...gle.com>, 
	Willem de Bruijn <willemb@...gle.com>, netdev@...r.kernel.org, eric.dumazet@...il.com, 
	lkolbe@...iuswillert.com
Subject: Re: [PATCH net-next 7/8] tcp: stronger sk_rcvbuf checks

On Thu, Dec 18, 2025 at 1:28 PM Christian Ebner <c.ebner@...xmox.com> wrote:
>
> Hi Eric,
>
> thank you for your reply!
>
> On 12/18/25 11:10 AM, Eric Dumazet wrote:
> > Can you give us (on receive side) : cat /proc/sys/net/ipv4/tcp_rmem
>
> Affected users report they have the respective kernels defaults set, so:
> - "4096 131072 6291456"  for v6.17 builds
> - "4096 131072 33554432" with the bumped max value of 32M for v6.18 builds
>
> > It seems your application is enforcing a small SO_RCVBUF ?
>
> No, we can exclude that, since the output of `ss -tim` shows the default
> buffer size once the connection is established, growing up to the max
> value during traffic (backups being performed).
>

The trace you provided seems to show a very different picture:

[::ffff:10.xx.xx.aa]:8007
       [::ffff:10.xx.xx.bb]:55554
          skmem:(r0,rb7488,t0,tb332800,f0,w0,o0,bl0,d20) cubic
wscale:10,10 rto:201 rtt:0.085/0.015 ato:40 mss:8948 pmtu:9000
rcvmss:7168 advmss:8948 cwnd:10 bytes_sent:937478 bytes_acked:937478
bytes_received:1295747055 segs_out:301010 segs_in:162410
data_segs_out:1035 data_segs_in:161588 send 8.42Gbps lastsnd:3308
lastrcv:191 lastack:191 pacing_rate 16.7Gbps delivery_rate 2.74Gbps
delivered:1036 app_limited busy:437ms rcv_rtt:207.551 rcv_space:96242
rcv_ssthresh:903417 minrtt:0.049 rcv_ooopack:23 snd_wnd:142336 rcv_wnd:7168

rb7488 would suggest that the application has set a very small SO_RCVBUF,
or that some memory allocation constraint (memcg?) is in play.
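
To make that comparison concrete, here is a minimal Python sketch (not part
of the original thread) that pulls the skmem counters out of an `ss -tim`
entry and compares `rb` against the tcp_rmem default quoted above. The helper
names `parse_skmem` and `parse_tcp_rmem` are hypothetical, and the sample
string is a shortened copy of the trace.

```python
import re

# Hypothetical helpers for illustration; they are not part of ss or the kernel.
SKMEM_RE = re.compile(r"skmem:\(([^)]*)\)")
FIELD_RE = re.compile(r"([a-z]+)(\d+)")


def parse_skmem(ss_text):
    """Return the skmem counters of one `ss -tim` socket entry as a dict."""
    match = SKMEM_RE.search(ss_text)
    if match is None:
        raise ValueError("no skmem:(...) field found")
    fields = {}
    for item in match.group(1).split(","):
        field = FIELD_RE.fullmatch(item.strip())
        if field is None:
            raise ValueError(f"unexpected skmem item: {item!r}")
        fields[field.group(1)] = int(field.group(2))
    return fields


def parse_tcp_rmem(text):
    """Split the contents of /proc/sys/net/ipv4/tcp_rmem into (min, default, max)."""
    minimum, default, maximum = (int(value) for value in text.split())
    return minimum, default, maximum


# Shortened copy of the trace above.
sample = "skmem:(r0,rb7488,t0,tb332800,f0,w0,o0,bl0,d20) cubic wscale:10,10"
skmem = parse_skmem(sample)

# Values reported in the thread for the v6.17 builds.
rmem_min, rmem_default, rmem_max = parse_tcp_rmem("4096 131072 6291456")

# rb is the receive buffer size reported for the socket; a value far below
# the tcp_rmem default is what looks suspicious here.
rb_below_default = skmem["rb"] < rmem_default
print(f"rb={skmem['rb']} default={rmem_default} below_default={rb_below_default}")
```

On a live system the tcp_rmem string would be read from
/proc/sys/net/ipv4/tcp_rmem rather than hard-coded.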

> Might out-of-order packets and small (us scale) RTTs play a role?
> `ss` reports `rcv_ooopack` when connections are stale, and the great
> majority of affected users have MTU 9000 (the default MTU seems to
> reduce the likelihood of this happening as well).
>
> > I would take a look at
> >
> > ecfea98b7d0d tcp: add net.ipv4.tcp_rcvbuf_low_rtt
> > 416dd649f3aa tcp: add net.ipv4.tcp_comp_sack_rtt_percent
> > aa251c84636c tcp: fix too slow tcp_rcvbuf_grow() action
>
> Thanks a lot for the hints, we did already provide a test build with
> commit aa251c84636c cherry-picked on top of 6.17.11 to affected users,
> but they were still running into stale connections.
> So while this (and most likely the increased `tcp_rmem[2]` default)
> seems to reduce the likelihood of stalls occurring, it does not fix them.
>
> > After applying these patches, you can on the receiver :
> >
> > perf record -a -e tcp:tcp_rcvbuf_grow sleep 30 ; perf script
>
> We have now provided test builds with the mentioned commits cherry-picked
> as well, and further asked users to test with v6.18.1 stable.
>
> Let me get back to you with the requested traces and test results.
>
> Best regards,
> Christian Ebner
>