Message-ID: <CADVnQy=u4zQjC6A9WBvnZOvRFAVzi=0rSzbSs_4kujB2u0U2EQ@mail.gmail.com>
Date: Tue, 3 Feb 2026 09:29:09 -0500
From: Neal Cardwell <ncardwell@...gle.com>
To: Kathrin Elmenhorst <kelmenhorst@...-osnabrueck.de>
Cc: Eric Dumazet <edumazet@...gle.com>, Kuniyuki Iwashima <kuniyu@...gle.com>, netdev@...r.kernel.org
Subject: Re: [PATCH net-next] net: tcp_bbr: use high pacing gain when the
sender fails to put enough data inflight
On Tue, Feb 3, 2026 at 3:36 AM Kathrin Elmenhorst
<kelmenhorst@...-osnabrueck.de> wrote:
>
> > AFAICT this patch does not look like a safe solution, because there
> > are many reasons that the actual number of bytes in flight can be less
> > than the target inflight. Notably, it is very common for applications
> > to be application-limited, i.e., they don't have enough data to send
> > to fully utilize the BDP of the network path. This is very common for
> > the most common kinds of TCP workloads: web, streaming video, RPC,
> > SSH, etc. It does not seem safe to increase the pacing gain to
> > bbr_high_gain in these common application-limited scenarios, because
> > this can cause bursts of data to arrive at the bottleneck link at more
> > than twice the available bandwidth, which can cause very high queuing
> > and packet loss.
>
> Absolutely, app-limited and contention-limited sockets exhibit similar
> characteristics in terms of inflight bytes. My understanding was that
> BBR uses a high pacing gain when the socket is app-limited anyway; is
> that not correct?
> For example, bbr_check_full_bw_reached returns early if the socket is
> app-limited, so that BBR stays in (or is reset to) STARTUP mode with
> high pacing gain.
The "or is reset to" part is an incorrect reading of the code; BBR
does not reset itself to STARTUP when the connection is app-limited.
:-)
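To make that concrete, the app-limited check in question looks roughly
like this (paraphrased from memory, not a verbatim quote of
net/ipv4/tcp_bbr.c):

    static void bbr_check_full_bw_reached(struct sock *sk,
                                          const struct rate_sample *rs)
    {
            ...
            if (bbr_full_bw_reached(sk) || !bbr->round_start ||
                rs->is_app_limited)
                    return;
            /* ... otherwise check whether the measured bw has stopped
             * growing, and after enough flat rounds latch the
             * "full bw reached" flag so that STARTUP can exit ...
             */
    }

App-limited rounds are simply excluded from the "has the bandwidth
stopped growing?" estimation; nothing in this path touches bbr->mode,
so a connection that has already left STARTUP stays in whatever mode
it is in.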
> This behavior actually gave me the idea to fix the
> contention-limited socket in a similar way. But maybe I am missing
> other cases where BBR reacts differently to app-limitation.
Please note that in bbr_cwnd_event() when an app-limited connection in
BBR_PROBE_BW restarts from idle, BBR sets the pacing rate to 1.0x the
estimated bandwidth. This is the common case for long-lived BBR
connections with application-limited behavior. Your proposed patch
makes the behavior much more aggressive in this case.
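That path looks roughly like this (again paraphrased from memory, not
a verbatim quote of net/ipv4/tcp_bbr.c):

    static void bbr_cwnd_event(struct sock *sk, enum tcp_ca_event event)
    {
            struct tcp_sock *tp = tcp_sk(sk);
            struct bbr *bbr = inet_csk_ca(sk);

            if (event == CA_EVENT_TX_START && tp->app_limited) {
                    bbr->idle_restart = 1;
                    ...
                    /* Restarting from idle while app-limited: pace at
                     * 1.0x the estimated bw (gain = BBR_UNIT) to avoid
                     * pointless bursts into the bottleneck.
                     */
                    if (bbr->mode == BBR_PROBE_BW)
                            bbr_set_pacing_rate(sk, bbr_bw(sk), BBR_UNIT);
                    ...
            }
    }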
> > But if you have further results that you can share, I'd appreciate it.
>
> I separately sent you the arXiv link to our paper. Due to a conference
> submission policy, we are not allowed to advertise it on a public
> mailing list yet.
Thanks for the pointer, and thanks for the nice paper with those
detailed experiments. I agree that this is a worthwhile problem to
solve, but my sense is that we will need a solution that is more
narrowly targeted than raising the pacing gain to bbr_high_gain any
time the actual number of bytes in flight falls below the target
inflight.
I suspect we want something that builds on the following patches by Eric:
commit fefa569a9d4bc4b7758c0fddd75bb0382c95da77
Author: Eric Dumazet <edumazet@...gle.com>
Date: Thu Sep 22 08:58:55 2016 -0700
net_sched: sch_fq: account for schedule/timers drifts
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fefa569a9d4bc4b7758c0fddd75bb0382c95da77
commit a7a2563064e963bc5e3f39f533974f2730c0ff56
Author: Eric Dumazet <edumazet@...gle.com>
Date: Mon Oct 15 09:37:54 2018 -0700
tcp: mitigate scheduling jitter in EDT pacing model
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a7a2563064e963bc5e3f39f533974f2730c0ff56
With both of those patches, AFAIK the basic idea is to adaptively
measure how far the actual transmissions are running behind the
intended pacing schedule, and to shorten subsequent pacing delays so
that the achieved rate tracks the intended pacing rate more closely.
This kind of approach:
(a) should more narrowly target cases where the actual pacing rate can
fall far below the intended pacing rate (cases like low-CPU-budget
VMs)
(b) should help all CCs that use pacing, not just BBR (e.g., paced CUBIC)
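For context, the relevant part of tcp_update_skb_after_send() looks
roughly like this (simplified from my reading of net/ipv4/tcp_output.c,
not a verbatim quote):

    /* How long this skb should occupy the wire at the pacing rate. */
    u64 len_ns = (u64)skb->len * NSEC_PER_SEC / rate;
    /* How far the actual send time ran past the planned schedule. */
    u64 credit = tp->tcp_wstamp_ns - prior_wstamp;

    /* take into account OS jitter */
    len_ns -= min_t(u64, len_ns / 2, credit);
    tp->tcp_wstamp_ns += len_ns;   /* planned departure of the next skb */

so today at most half of one packet interval of lateness is clawed back
per transmitted skb.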
For example, what happens if you change the line in
tcp_update_skb_after_send() that says:
len_ns -= min_t(u64, len_ns / 2, credit);
to say:
len_ns -= min_t(u64, len_ns, credit);
Does that help BBR performance in these low-CPU-budget VMs?
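To make the intuition concrete, here is a toy, untested user-space
model (my own sketch, not kernel code) of the two crediting policies,
assuming every transmit opportunity runs a fixed amount late and
ignoring fq, TSO autosizing, and burstiness:

    /* Toy model of EDT pacing with late transmit opportunities.
     * Packets are 1500 bytes paced at 100 Mbit/s (120 us apart), and
     * every send happens 100 us after its planned departure time.
     */
    #include <stdio.h>
    #include <stdint.h>

    #define NSEC_PER_SEC 1000000000ULL

    static double achieved_mbit(uint64_t cap_ns, uint64_t len_ns,
                                uint64_t late_ns, int npkts,
                                uint64_t pkt_bytes)
    {
            uint64_t wstamp = 0;    /* planned departure of next packet */
            uint64_t now = 0;       /* time the stack actually sends it */
            uint64_t first = 0, last = 0;

            for (int i = 0; i < npkts; i++) {
                    now = (wstamp > now ? wstamp : now) + late_ns;
                    if (i == 0)
                            first = now;
                    last = now;

                    /* how far behind the planned schedule we are */
                    uint64_t credit = now - wstamp;
                    /* give back at most cap_ns of that lateness */
                    uint64_t gap = len_ns -
                            (credit < cap_ns ? credit : cap_ns);

                    wstamp = now + gap;     /* schedule the next packet */
            }
            /* bits delivered between first and last departure, in Mbit/s */
            return (double)(npkts - 1) * pkt_bytes * 8 * NSEC_PER_SEC /
                   ((last - first) * 1e6);
    }

    int main(void)
    {
            const uint64_t pkt = 1500;
            const uint64_t rate_Bps = 100 * 1000 * 1000 / 8;
            const uint64_t len_ns = pkt * NSEC_PER_SEC / rate_Bps;
            const uint64_t late_ns = 100 * 1000;    /* 100 us late */

            printf("cap len_ns/2: %.1f Mbit/s\n",
                   achieved_mbit(len_ns / 2, len_ns, late_ns, 10000, pkt));
            printf("cap len_ns  : %.1f Mbit/s\n",
                   achieved_mbit(len_ns, len_ns, late_ns, 10000, pkt));
            return 0;
    }

With those made-up numbers the len_ns/2 cap levels off around 75
Mbit/s, while crediting up to the full len_ns holds the intended 100
Mbit/s; the real question, of course, is what your experiments on the
low-CPU-budget VMs show.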
Related questions, for VMs using paced CUBIC (CUBIC with the fq
qdisc), would be: (a) how much does paced CUBIC suffer on
low-CPU-budget VMs? (b) how much does the alternate len_ns computation
help paced CUBIC on such VMs?
best regards,
neal