[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iLzo0Wk7p=dtUQ4Q2-pCAsjSxXZw71ngNTw6NZbEEvoDA@mail.gmail.com>
Date: Mon, 6 Jan 2025 09:24:35 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Mahdi Arghavani <ma.arghavani@...oo.com>, Neal Cardwell <ncardwell@...gle.com>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>, Haibo Zhang <haibo.zhang@...go.ac.nz>,
David Eyers <david.eyers@...go.ac.nz>, Abbas Arghavani <abbas.arghavani@....se>
Subject: Re: [PATCH net] tcp_cubic: Fix for bug in HyStart implementation in
the Linux kernel
On Mon, Jan 6, 2025 at 5:53 AM Mahdi Arghavani <ma.arghavani@...oo.com> wrote:
>
> Hi,
>
> While refining the source code for our project (SUSS), I discovered a bug in the implementation of HyStart in the Linux kernel, starting from version v5.15.6. The issue, caused by incorrect marking of round starts, results in inaccurate measurement of the length of each ACK train. Since HyStart relies on the length of ACK trains as one of two key criteria to stop exponential cwnd growth during Slow-Start, this inaccuracy renders the criterion ineffective, potentially degrading TCP performance.
>
Hi Mahdi
netdev@ mailing list does not accept HTML messages.
Am I right to say you are referring to
commit 8165a96f6b7122f25bf809aecf06f17b0ec37b58
Author: Eric Dumazet <edumazet@...gle.com>
Date: Tue Nov 23 12:25:35 2021 -0800
tcp_cubic: fix spurious Hystart ACK train detections for
not-cwnd-limited flows
[ Upstream commit 4e1fddc98d2585ddd4792b5e44433dcee7ece001 ]
> Issue Description: The problem arises because the hystart_reset function is not called upon receiving the first ACK (when cwnd=iw=10, see the attached figure). Instead, its invocation is delayed until the condition cwnd >= hystart_low_window is satisfied. In each round, this delay causes:
>
> 1) A postponed marking of the start of a new round.
>
> 2) An incorrect update of ca->end_seq, leading to incorrect marking of the subsequent round.
>
> As a result, the ACK train length is underestimated, which adversely affects HyStart’s first criterion for stopping cwnd exponential growth.
>
> Proposed Solution: Below is a tested patch that addresses the issue by ensuring hystart_reset is triggered appropriately:
>
Please provide a packetdrill test, this will be the most efficient way
to demonstrate the issue.
In general, ACK trains detection is not useful in modern networks,
because of TSO and GRO.
Reference : https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ede656e8465839530c3287c7f54adf75dc2b9563
Note that we are still waiting for an HyStart++ implementation for linux,
you may be interested in working on it.
Thank you.
>
>
> diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
>
> index 5dbed91c6178..78d9cf493ace 100644
>
> --- a/net/ipv4/tcp_cubic.c
>
> +++ b/net/ipv4/tcp_cubic.c
>
> @@ -392,6 +392,9 @@ static void hystart_update(struct sock *sk, u32 delay)
>
> if (after(tp->snd_una, ca->end_seq))
>
> bictcp_hystart_reset(sk);
>
>
>
> + if (tcp_snd_cwnd(tp) < hystart_low_window)
>
> + return;
>
> +
>
> if (hystart_detect & HYSTART_ACK_TRAIN) {
>
> u32 now = bictcp_clock_us(sk);
>
>
>
> @@ -468,8 +471,7 @@ __bpf_kfunc static void cubictcp_acked(struct sock *sk, const struct ack_sample
>
> ca->delay_min = delay;
>
>
>
> /* hystart triggers when cwnd is larger than some threshold */
>
> - if (!ca->found && tcp_in_slow_start(tp) && hystart &&
>
> - tcp_snd_cwnd(tp) >= hystart_low_window)
>
> + if (!ca->found && tcp_in_slow_start(tp) && hystart)
>
> hystart_update(sk, delay);
>
> }
>
> Best wishes,
> Mahdi Arghavani
Powered by blists - more mailing lists