Message-ID: <1f7bae32-76e3-4f63-bcb8-89f6aaabc0e1@quicinc.com>
Date: Sat, 18 May 2024 20:13:48 -0600
From: "Subash Abhinov Kasiviswanathan (KS)" <quic_subashab@...cinc.com>
To: Eric Dumazet <edumazet@...gle.com>
CC: <soheil@...gle.com>, <ncardwell@...gle.com>, <yyd@...gle.com>,
        <ycheng@...gle.com>, <quic_stranche@...cinc.com>,
        <davem@...emloft.net>, <kuba@...nel.org>, <netdev@...r.kernel.org>
Subject: Re: Potential impact of commit dfa2f0483360 ("tcp: get rid of
 sysctl_tcp_adv_win_scale")



On 5/17/2024 1:08 AM, Subash Abhinov Kasiviswanathan (KS) wrote:
> 
> 
> On 5/16/2024 12:49 PM, Subash Abhinov Kasiviswanathan (KS) wrote:
>> On 5/16/2024 2:31 AM, Eric Dumazet wrote:
>>> On Thu, May 16, 2024 at 9:57 AM Eric Dumazet <edumazet@...gle.com> wrote:
>>>>
>>>> On Thu, May 16, 2024 at 9:16 AM Subash Abhinov Kasiviswanathan (KS)
>>>> <quic_subashab@...cinc.com> wrote:
>>>>>
>>>>> On 5/15/2024 11:36 PM, Eric Dumazet wrote:
>>>>>> On Thu, May 16, 2024 at 4:32 AM Subash Abhinov Kasiviswanathan (KS)
>>>>>> <quic_subashab@...cinc.com> wrote:
>>>>>>>
>>>>>>> On 5/15/2024 1:10 AM, Eric Dumazet wrote:
>>>>>>>> On Wed, May 15, 2024 at 6:47 AM Subash Abhinov Kasiviswanathan (KS)
>>>>>>>> <quic_subashab@...cinc.com> wrote:
>>>>>>>>>
>>>>>>>>> We recently noticed that a device running a 6.6.17 kernel (A) was
>>>>>>>>> having a slower single stream download speed compared to a device
>>>>>>>>> running 6.1.57 kernel (B). The test here is over mobile radio with
>>>>>>>>> iperf3 with window size 4M from a third party server.
>>>>>>>>
>>>>>>
>>> This is not fixable easily, because tp->window_clamp has been
>>> historically abused.
>>>
>>> TCP_WINDOW_CLAMP socket option should have used a separate tcp socket
>>> field to remember tp->window_clamp has been set (fixed) to a user value.
>>>
>>> Make sure you have this followup patch, dealing with applications
>>> still needing to make TCP slow.
>>>
>>> commit 697a6c8cec03c2299f850fa50322641a8bf6b915
>>> Author: Hechao Li <hli@...flix.com>
>>> Date:   Tue Apr 9 09:43:55 2024 -0700
>>>
>>>      tcp: increase the default TCP scaling ratio
>>>> What happens if you let autotuning enabled ?
>> I'll try this test and also the test with 4M SO_RCVBUF on the device 
>> configuration where the download issue was observed and report back 
>> with the findings.
> With autotuning, the receiver window scaled to ~9M. The download speed 
> matched whatever I got with setting SO_RCVBUF 16M on A earlier (which 
> aligns with previous observation as the window scaled to ~8M without the 
> commit).
> 
> With 4M SO_RCVBUF, the receiver window scaled to ~4M. Download speed 
> increased significantly but didn't match the download speed of B with 4M 
> SO_RCVBUF. Per commit description, the commit matches the behavior as if 
> tcp_adv_win_scale was set to 1.
> 
> Download speed of B is higher than A for 4M SO_RCVBUF as receiver window 
> of B grew to ~6M. This is because B had tcp_adv_win_scale set to 2.
Would the following change, which re-enables the use of the sysctl
tcp_adv_win_scale to set the initial scaling ratio, be acceptable?
The default value of tcp_adv_win_scale is 1, which corresponds to the
existing 50% ratio.

I verified with this patch on A that setting SO_RCVBUF 4M in iperf3 with 
tcp_adv_win_scale = 1 (default) scales receiver window to ~4M while 
tcp_adv_win_scale = 2 scales receiver window to ~6M (which matches the 
behavior from B).

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 618f991cb336..1bca7d2e47c8 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1460,14 +1460,23 @@ static inline int tcp_space_from_win(const struct sock *sk, int win)
         return __tcp_space_from_win(tcp_sk(sk)->scaling_ratio, win);
  }

-/* Assume a 50% default for skb->len/skb->truesize ratio.
- * This may be adjusted later in tcp_measure_rcv_mss().
- */
-#define TCP_DEFAULT_SCALING_RATIO (1 << (TCP_RMEM_TO_WIN_SCALE - 1))
-
  static inline void tcp_scaling_ratio_init(struct sock *sk)
  {
-       tcp_sk(sk)->scaling_ratio = TCP_DEFAULT_SCALING_RATIO;
+       int win_scale = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_adv_win_scale);
+
+       if (win_scale <= 0) {
+               if (win_scale < -TCP_RMEM_TO_WIN_SCALE)
+                       win_scale = -TCP_RMEM_TO_WIN_SCALE;
+
+               tcp_sk(sk)->scaling_ratio =
+                       1 << (TCP_RMEM_TO_WIN_SCALE + win_scale);
+       } else {
+               if (win_scale > TCP_RMEM_TO_WIN_SCALE)
+                       win_scale = TCP_RMEM_TO_WIN_SCALE;
+
+               tcp_sk(sk)->scaling_ratio = U8_MAX -
+                       (1 << (TCP_RMEM_TO_WIN_SCALE - win_scale));
+       }
  }

  /* Note: caller must be prepared to deal with negative returns */
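
For reference, the arithmetic behind the ~4M and ~6M figures can be sketched in userspace C. This is only an illustrative model, not kernel code: RMEM_TO_WIN_SCALE is assumed to be 8 (mirroring TCP_RMEM_TO_WIN_SCALE), the helper names are hypothetical, and the 8M input assumes the kernel doubles the 4M SO_RCVBUF request. The ratio is returned as an int so that values that would not fit in an 8-bit scaling_ratio field (assumed to be u8) stay visible.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors the kernel's TCP_RMEM_TO_WIN_SCALE. */
#define RMEM_TO_WIN_SCALE 8

/*
 * Hypothetical userspace model of the mapping in the patch above:
 * tcp_adv_win_scale -> initial scaling ratio (in units of 1/256).
 * Returned as int, so a result of 256 is not silently truncated.
 */
static int ratio_from_adv_win_scale(int win_scale)
{
	if (win_scale <= 0) {
		/* Clamp so the shift count never goes negative. */
		if (win_scale < -RMEM_TO_WIN_SCALE)
			win_scale = -RMEM_TO_WIN_SCALE;
		return 1 << (RMEM_TO_WIN_SCALE + win_scale);
	}

	if (win_scale > RMEM_TO_WIN_SCALE)
		win_scale = RMEM_TO_WIN_SCALE;
	return UINT8_MAX - (1 << (RMEM_TO_WIN_SCALE - win_scale));
}

/* Receive window obtained from buffer space for a given ratio. */
static int64_t win_from_space(int ratio, int64_t space)
{
	assert(space >= 0);
	return (space * ratio) >> RMEM_TO_WIN_SCALE;
}
```

Under these assumptions, win_from_space(ratio_from_adv_win_scale(1), 8 << 20) gives 4161536 bytes (~4M), and the same call with tcp_adv_win_scale = 2 gives 6258688 bytes (~6M). In this model tcp_adv_win_scale = 0 maps to 256, which would need care if scaling_ratio is an 8-bit field.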
