Message-ID: <1f7bae32-76e3-4f63-bcb8-89f6aaabc0e1@quicinc.com>
Date: Sat, 18 May 2024 20:13:48 -0600
From: "Subash Abhinov Kasiviswanathan (KS)" <quic_subashab@...cinc.com>
To: Eric Dumazet <edumazet@...gle.com>
CC: <soheil@...gle.com>, <ncardwell@...gle.com>, <yyd@...gle.com>,
<ycheng@...gle.com>, <quic_stranche@...cinc.com>,
<davem@...emloft.net>, <kuba@...nel.org>, <netdev@...r.kernel.org>
Subject: Re: Potential impact of commit dfa2f0483360 ("tcp: get rid of
sysctl_tcp_adv_win_scale")
On 5/17/2024 1:08 AM, Subash Abhinov Kasiviswanathan (KS) wrote:
>
>
> On 5/16/2024 12:49 PM, Subash Abhinov Kasiviswanathan (KS) wrote:
>> On 5/16/2024 2:31 AM, Eric Dumazet wrote:
>>> On Thu, May 16, 2024 at 9:57 AM Eric Dumazet <edumazet@...gle.com>
>>> wrote:
>>>>
>>>> On Thu, May 16, 2024 at 9:16 AM Subash Abhinov Kasiviswanathan (KS)
>>>> <quic_subashab@...cinc.com> wrote:
>>>>>
>>>>> On 5/15/2024 11:36 PM, Eric Dumazet wrote:
>>>>>> On Thu, May 16, 2024 at 4:32 AM Subash Abhinov Kasiviswanathan (KS)
>>>>>> <quic_subashab@...cinc.com> wrote:
>>>>>>>
>>>>>>> On 5/15/2024 1:10 AM, Eric Dumazet wrote:
>>>>>>>> On Wed, May 15, 2024 at 6:47 AM Subash Abhinov Kasiviswanathan (KS)
>>>>>>>> <quic_subashab@...cinc.com> wrote:
>>>>>>>>>
>>>>>>>>> We recently noticed that a device running a 6.6.17 kernel (A)
>>>>>>>>> was having
>>>>>>>>> a slower single stream download speed compared to a device running
>>>>>>>>> 6.1.57 kernel (B). The test here is over mobile radio with
>>>>>>>>> iperf3 with
>>>>>>>>> window size 4M from a third party server.
>>>>>>>>
>>>>>>
>>> This is not fixable easily, because tp->window_clamp has been
>>> historically abused.
>>>
>>> TCP_WINDOW_CLAMP socket option should have used a separate tcp socket
>>> field
>>> to remember tp->window_clamp has been set (fixed) to a user value.
>>>
>>> Make sure you have this followup patch, dealing with applications
>>> still needing to make TCP slow.
>>>
>>> commit 697a6c8cec03c2299f850fa50322641a8bf6b915
>>> Author: Hechao Li <hli@...flix.com>
>>> Date: Tue Apr 9 09:43:55 2024 -0700
>>>
>>> tcp: increase the default TCP scaling ratio
>>>> What happens if you leave autotuning enabled?
>> I'll try this test and also the test with 4M SO_RCVBUF on the device
>> configuration where the download issue was observed and report back
>> with the findings.
> With autotuning, the receiver window scaled to ~9M. The download speed
> matched whatever I got with setting SO_RCVBUF 16M on A earlier (which
> aligns with previous observation as the window scaled to ~8M without the
> commit).
>
> With 4M SO_RCVBUF, the receiver window scaled to ~4M. Download speed
> increased significantly but didn't match the download speed of B with 4M
> SO_RCVBUF. Per commit description, the commit matches the behavior as if
> tcp_adv_win_scale was set to 1.
>
> Download speed of B is higher than A for 4M SO_RCVBUF as receiver window
> of B grew to ~6M. This is because B had tcp_adv_win_scale set to 2.
Would the following change, which re-enables the use of the sysctl
tcp_adv_win_scale to set the initial scaling ratio, be acceptable?
The default value of tcp_adv_win_scale is 1, which corresponds to the
existing 50% ratio.
I verified with this patch on A that setting SO_RCVBUF 4M in iperf3 with
tcp_adv_win_scale = 1 (default) scales receiver window to ~4M while
tcp_adv_win_scale = 2 scales receiver window to ~6M (which matches the
behavior from B).
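
For reference, the ~4M and ~6M figures are consistent with the window
calculation used before commit dfa2f0483360. Below is a minimal
user-space sketch, assuming the legacy formula
(window = space - (space >> tcp_adv_win_scale) for positive values) and
assuming the kernel doubles the 4M SO_RCVBUF request to 8M of buffer
space. legacy_win_from_space() is an illustrative name, not a kernel
function.

```c
/* Legacy (pre-dfa2f0483360) mapping from receive buffer space to the
 * window, written from the sysctl's documented semantics. Illustrative
 * only, not the kernel source.
 */
static long legacy_win_from_space(long space, int adv_win_scale)
{
	/* Zero or negative values: window = space >> -adv_win_scale */
	if (adv_win_scale <= 0)
		return space >> (-adv_win_scale);

	/* Positive values: reserve space >> adv_win_scale as overhead */
	return space - (space >> adv_win_scale);
}
```

With space = 8388608 (assumed 2 * 4M), this gives 4194304 for
tcp_adv_win_scale = 1 and 6291456 for tcp_adv_win_scale = 2.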
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 618f991cb336..1bca7d2e47c8 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1460,14 +1460,23 @@ static inline int tcp_space_from_win(const struct sock *sk, int win)
 	return __tcp_space_from_win(tcp_sk(sk)->scaling_ratio, win);
 }
 
-/* Assume a 50% default for skb->len/skb->truesize ratio.
- * This may be adjusted later in tcp_measure_rcv_mss().
- */
-#define TCP_DEFAULT_SCALING_RATIO (1 << (TCP_RMEM_TO_WIN_SCALE - 1))
-
 static inline void tcp_scaling_ratio_init(struct sock *sk)
 {
-	tcp_sk(sk)->scaling_ratio = TCP_DEFAULT_SCALING_RATIO;
+	int win_scale = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_adv_win_scale);
+
+	if (win_scale <= 0) {
+		if (win_scale < -TCP_RMEM_TO_WIN_SCALE)
+			win_scale = -TCP_RMEM_TO_WIN_SCALE;
+
+		tcp_sk(sk)->scaling_ratio =
+			1 << (TCP_RMEM_TO_WIN_SCALE + win_scale);
+	} else {
+		if (win_scale > TCP_RMEM_TO_WIN_SCALE)
+			win_scale = TCP_RMEM_TO_WIN_SCALE;
+
+		tcp_sk(sk)->scaling_ratio = U8_MAX -
+			(1 << (TCP_RMEM_TO_WIN_SCALE - win_scale));
+	}
 }
 
 /* Note: caller must be prepared to deal with negative returns */
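
To make the mapping in the patch easier to check, here is a user-space
sketch of the same arithmetic. It assumes TCP_RMEM_TO_WIN_SCALE is 8 and
that the window is derived as (space * ratio) >> TCP_RMEM_TO_WIN_SCALE.
ratio_from_adv_win_scale() and win_from_space() are hypothetical helper
names. The ratio is returned as an int so that a value that would not
fit in a u8 stays visible.

```c
#include <stdint.h>

#define TCP_RMEM_TO_WIN_SCALE 8	/* assumed to match the kernel value */

/* Same branches as the patched tcp_scaling_ratio_init(), returning the
 * ratio instead of storing it in the socket.
 */
static int ratio_from_adv_win_scale(int win_scale)
{
	if (win_scale <= 0) {
		if (win_scale < -TCP_RMEM_TO_WIN_SCALE)
			win_scale = -TCP_RMEM_TO_WIN_SCALE;
		return 1 << (TCP_RMEM_TO_WIN_SCALE + win_scale);
	}

	if (win_scale > TCP_RMEM_TO_WIN_SCALE)
		win_scale = TCP_RMEM_TO_WIN_SCALE;
	return UINT8_MAX - (1 << (TCP_RMEM_TO_WIN_SCALE - win_scale));
}

/* Assumed window computation: scale the buffer space by ratio / 256. */
static int64_t win_from_space(int64_t space, int ratio)
{
	return (space * ratio) >> TCP_RMEM_TO_WIN_SCALE;
}
```

Under these assumptions, tcp_adv_win_scale = 1 gives a ratio of 127
(just under 50%) and tcp_adv_win_scale = 2 gives 191 (about 75%), so 8M
of buffer space maps to roughly 4M and 6M of window. A value of 0 gives
256, which would not fit if scaling_ratio is a u8, as the U8_MAX bound
suggests, so that case may need a clamp.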