Message-ID: <CANn89iJUyBfXeSQr_QZWaQP58ZO_1c6hMe7F15sq=qYsa=TyTA@mail.gmail.com>
Date: Mon, 20 May 2024 19:42:59 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: "Subash Abhinov Kasiviswanathan (KS)" <quic_subashab@...cinc.com>
Cc: soheil@...gle.com, ncardwell@...gle.com, yyd@...gle.com, ycheng@...gle.com, 
	quic_stranche@...cinc.com, davem@...emloft.net, kuba@...nel.org, 
	netdev@...r.kernel.org
Subject: Re: Potential impact of commit dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale")

On Mon, May 20, 2024 at 7:33 PM Subash Abhinov Kasiviswanathan (KS)
<quic_subashab@...cinc.com> wrote:
>
> On 5/20/2024 11:20 AM, Eric Dumazet wrote:
> > On Mon, May 20, 2024 at 7:09 PM Subash Abhinov Kasiviswanathan (KS)
> > <quic_subashab@...cinc.com> wrote:
> >>
> >> On 5/20/2024 9:12 AM, Eric Dumazet wrote:
> >>> On Sun, May 19, 2024 at 4:14 AM Subash Abhinov Kasiviswanathan (KS)
> >>> <quic_subashab@...cinc.com> wrote:
> >>>>>>>>>>>>> We recently noticed that a device running a 6.6.17 kernel (A) was
> >>>>>>>>>>>>> having a slower single stream download speed compared to a device
> >>>>>>>>>>>>> running 6.1.57 kernel (B). The test here is over mobile radio with
> >>>>>>>>>>>>> iperf3 with window size 4M from a third party server.
> >>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>> This is not fixable easily, because tp->window_clamp has been
> >>>>>>> historically abused.
> >>>>>>>
> >>>>>>> TCP_WINDOW_CLAMP socket option should have used a separate tcp socket
> >>>>>>> field to remember tp->window_clamp has been set (fixed) to a user value.
> >>>>>>>
> >>>>>>> Make sure you have this followup patch, dealing with applications
> >>>>>>> still needing to make TCP slow.
> >>>>>>>
> >>>>>>> commit 697a6c8cec03c2299f850fa50322641a8bf6b915
> >>>>>>> Author: Hechao Li <hli@...flix.com>
> >>>>>>> Date:   Tue Apr 9 09:43:55 2024 -0700
> >>>>>>>
> >>>>>>>        tcp: increase the default TCP scaling ratio
> >>>>> With 4M SO_RCVBUF, the receiver window scaled to ~4M. Download speed
> >>>>> increased significantly but didn't match the download speed of B with 4M
> >>>>> SO_RCVBUF. Per commit description, the commit matches the behavior as if
> >>>>> tcp_adv_win_scale was set to 1.
> >>>>>
> >>>>> Download speed of B is higher than A for 4M SO_RCVBUF as receiver window
> >>>>> of B grew to ~6M. This is because B had tcp_adv_win_scale set to 2.
> >>>> Would the following change, which re-enables the use of sysctl
> >>>> tcp_adv_win_scale to set the initial scaling ratio, be acceptable?
> >>>> The default value of tcp_adv_win_scale is 1, which corresponds to the
> >>>> existing 50% ratio.
> >>>>
> >>>> I verified with this patch on A that setting SO_RCVBUF 4M in iperf3 with
> >>>> tcp_adv_win_scale = 1 (default) scales receiver window to ~4M while
> >>>> tcp_adv_win_scale = 2 scales receiver window to ~6M (which matches the
> >>>> behavior from B).
> >
> > I do not think we want to bring back a config option that has been
> > superseded by something allowing a host to have multiple NICs, with
> > different MTUs, and multiple TCP flows with various MSS.
> The default value still stays 1 and all of the accurate estimation of
> skb->len/skb->truesize still remains for all auto tuning users. I
> believe that should continue to support all the configurations you mentioned.
>
> I merely want to add the flexibility for users who have been affected
> here due to lack of backwards compatibility (SO_RCVBUF with
> tcp_adv_win_scale value other than 1).

A sysctl is the old way, sorry.

If a fix is needed, it needs to be at the time the kernel can learn
the effective skb->len/skb->truesize ratio.
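
[Archive note] The ~4M and ~6M window figures quoted in this thread follow from simple arithmetic, sketched below in Python. This is an illustration under stated assumptions, not kernel code: the function names are hypothetical, the sysctl formula is the commonly documented mapping (positive values reserve 1/2^N of the buffer for overhead), the 8-bit fixed-point ratio is an assumption about how the post-dfa2f0483360 kernel stores skb->len/skb->truesize, and the requested 4M SO_RCVBUF is assumed to be within rmem_max. The doubling of the SO_RCVBUF value is the behavior described in socket(7).

```python
# Sketch only: reproduces the window arithmetic discussed in the thread.
# All names here are illustrative; none of this is taken from kernel source.

MIB = 1 << 20
RATIO_SHIFT = 8  # assumption: ratio kept as a fraction of 256


def win_from_space_sysctl(space: int, adv_win_scale: int) -> int:
    """Old mapping: the sysctl decides how much of the buffer is window."""
    if adv_win_scale <= 0:
        return space >> (-adv_win_scale)
    # Positive N reserves space / 2**N for skb overhead.
    return space - (space >> adv_win_scale)


def scaling_ratio(skb_len: int, skb_truesize: int) -> int:
    """New mapping, step 1: payload-to-memory ratio learned per socket."""
    return (skb_len << RATIO_SHIFT) // skb_truesize


def win_from_space_ratio(space: int, ratio: int) -> int:
    """New mapping, step 2: scale the buffer by the learned ratio."""
    return (space * ratio) >> RATIO_SHIFT


# The kernel doubles the value passed to SO_RCVBUF (see socket(7)).
rcvbuf = 2 * (4 * MIB)

print(win_from_space_sysctl(rcvbuf, 1) // MIB)  # 4 (50% of 8M, device A)
print(win_from_space_sysctl(rcvbuf, 2) // MIB)  # 6 (75% of 8M, device B)

# A default ratio of 128/256 reproduces the tcp_adv_win_scale = 1 result.
print(win_from_space_ratio(rcvbuf, 128) // MIB)  # 4
```

With a fixed SO_RCVBUF, the old sysctl let an administrator pick the 75% split globally, whereas the ratio-based mapping only moves away from its default once the socket has seen real skb->len/skb->truesize values, which is the point at which a fix is said above to belong.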
