netdev - Re: [PATCH net v4 2/3] tcp_cubic: fix to match Reno additive increment

Message-ID: <CANn89iK+d65eT3sP8Wo8cGb4a_39cDF_kHG=Fn5cmcv93gzBvg@mail.gmail.com>
Date: Tue, 20 Aug 2024 14:56:21 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Mingrui Zhang <mrzhang97@...il.com>
Cc: davem@...emloft.net, ncardwell@...gle.com, netdev@...r.kernel.org, 
	Lisong Xu <xu@....edu>
Subject: Re: [PATCH net v4 2/3] tcp_cubic: fix to match Reno additive increment

On Mon, Aug 19, 2024 at 11:03 PM Mingrui Zhang <mrzhang97@...il.com> wrote:
>
> On 8/19/24 03:22, Eric Dumazet wrote:
> > On Sat, Aug 17, 2024 at 6:35 PM Mingrui Zhang <mrzhang97@...il.com> wrote:
> >> The original code follows RFC 8312 (obsoleted CUBIC RFC).
> >>
> >> The patched code follows RFC 9438 (new CUBIC RFC):
> > Please give the precise location in the RFC (4.3 Reno-Friendly Region)
>
> Thank you, Eric,
> I will state it more clearly in the next version of the patch.
>
> >
> >> "Once _W_est_ has grown to reach the _cwnd_ at the time of most
> >> recently setting _ssthresh_ -- that is, _W_est_ >= _cwnd_prior_ --
> >> the sender SHOULD set α__cubic_ to 1 to ensure that it can achieve
> >> the same congestion window increment rate as Reno, which uses AIMD
> >> (1,0.5)."
> >>
> >> Add new field 'cwnd_prior' in bictcp to hold cwnd before a loss event
> >>
> >> Fixes: 89b3d9aaf467 ("[TCP] cubic: precompute constants")
> > RFC 9438 is brand new, I think we should not backport this patch to
> > stable linux versions.
> >
> > This would target net-next, unless there is clear evidence that it is
> > absolutely safe.
>
> I agree with you that this patch would target net-next.
>
> > Note the existence of tools/testing/selftests/bpf/progs/bpf_cc_cubic.c
> > and tools/testing/selftests/bpf/progs/bpf_cubic.c
> >
> > If this patch were a fix, I presume we would need to fix these files?
> In my understanding, bpf_cubic.c and bpf_cc_cubic.c are not designed to be fully equivalent to tcp_cubic; they focus more on testing BPF logic.
> For example, the current bpf_cubic does not include the changes from commit 9957b38b5e7a ("tcp_cubic: make hystart_ack_delay() aware of BIG TCP")
>
> Maybe we should ask the BPF maintainers whether to fix these BPF files?

We try (as TCP maintainers) to keep
tools/testing/selftests/bpf/progs/bpf_cubic.c up to date with the
kernel C code, because _if_ someone is really using BPF-based CUBIC,
they should get the fix eventually.

See for instance

commit 7d21d54d624777358ab6c7be7ff778808fef70ba
Author: Neal Cardwell <ncardwell@...gle.com>
Date:   Wed Jun 24 12:42:03 2020 -0400

    bpf: tcp: bpf_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT

    Apply the fix from:
     "tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT"
    to the BPF implementation of TCP CUBIC congestion control.

    Repeating the commit description here for completeness:

    Mirja Kuehlewind reported a bug in Linux TCP CUBIC Hystart, where
    Hystart HYSTART_DELAY mechanism can exit Slow Start spuriously on an
    ACK when the minimum rtt of a connection goes down. From inspection it
    is clear from the existing code that this could happen in an example
    like the following:

    o The first 8 RTT samples in a round trip are 150ms, resulting in a
      curr_rtt of 150ms and a delay_min of 150ms.

    o The 9th RTT sample is 100ms. The curr_rtt does not change after the
      first 8 samples, so curr_rtt remains 150ms. But delay_min can be
      lowered at any time, so delay_min falls to 100ms. The code executes
      the HYSTART_DELAY comparison between curr_rtt of 150ms and delay_min
      of 100ms, and the curr_rtt is declared far enough above delay_min to
      force a (spurious) exit of Slow start.

    The fix here is simple: allow every RTT sample in a round trip to
    lower the curr_rtt.

    Fixes: 6de4a9c430b5 ("bpf: tcp: Add bpf_cubic example")
    Reported-by: Mirja Kuehlewind <mirja.kuehlewind@...csson.com>
    Signed-off-by: Neal Cardwell <ncardwell@...gle.com>
    Signed-off-by: Eric Dumazet <edumazet@...gle.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@...gle.com>
    Signed-off-by: David S. Miller <davem@...emloft.net>
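
For readers who want to trace the example from the commit message above,
here is a minimal user-space sketch of the HYSTART_DELAY check. It is not
the kernel or BPF source: struct hs_state, on_rtt_sample(),
round_exits_slow_start() and the `fixed` flag are hypothetical names
introduced for illustration, and the threshold (delay_min / 8, clamped to
4..16 ms) plus the UINT32_MAX "no sample yet" sentinel are assumptions made
to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_SAMPLES  8       /* RTT samples collected before the delay check */
#define DELAY_MIN_US 4000U   /* assumed lower clamp for the threshold */
#define DELAY_MAX_US 16000U  /* assumed upper clamp for the threshold */

/* Hypothetical, simplified per-round state (all times in microseconds). */
struct hs_state {
	uint32_t curr_rtt;   /* minimum RTT recorded for the current round */
	uint32_t delay_min;  /* minimum RTT seen on the connection so far */
	uint32_t sample_cnt; /* samples counted in the current round */
};

/* Assumed threshold: delay_min / 8, clamped to [4 ms, 16 ms]. */
static uint32_t delay_thresh(uint32_t delay_min)
{
	uint32_t t = delay_min >> 3;

	if (t < DELAY_MIN_US)
		t = DELAY_MIN_US;
	if (t > DELAY_MAX_US)
		t = DELAY_MAX_US;
	return t;
}

/* Feed one RTT sample; returns true if the check would exit slow start. */
static bool on_rtt_sample(struct hs_state *s, uint32_t rtt_us, bool fixed)
{
	/* delay_min can be lowered at any time. */
	if (s->delay_min == 0 || rtt_us < s->delay_min)
		s->delay_min = rtt_us;

	/* Fixed behaviour: every sample in the round may lower curr_rtt. */
	if (fixed && rtt_us < s->curr_rtt)
		s->curr_rtt = rtt_us;

	if (s->sample_cnt < MIN_SAMPLES) {
		/* Old behaviour: only the first MIN_SAMPLES samples count. */
		if (!fixed && rtt_us < s->curr_rtt)
			s->curr_rtt = rtt_us;
		s->sample_cnt++;
		return false;
	}

	return s->curr_rtt > s->delay_min + delay_thresh(s->delay_min);
}

/* Run one round of samples; true if any sample triggers the exit. */
static bool round_exits_slow_start(const uint32_t *rtts, size_t n, bool fixed)
{
	struct hs_state s = { .curr_rtt = UINT32_MAX, .delay_min = 0,
			      .sample_cnt = 0 };

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		if (on_rtt_sample(&s, rtts[i], fixed))
			return true;
	return false;
}
```

With eight samples of 150000 us followed by one of 100000 us, the old
ordering leaves curr_rtt at 150 ms while delay_min drops to 100 ms, so the
comparison against 100 ms + 12.5 ms succeeds and the sketch reports a
(spurious) exit. With fixed set to true, the ninth sample also lowers
curr_rtt to 100 ms and no exit is reported.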