Message-Id: <20220729143935.2432743-1-marek@cloudflare.com>
Date: Fri, 29 Jul 2022 16:39:33 +0200
From: Marek Majkowski <marek@...udflare.com>
To: netdev@...r.kernel.org
Cc: bpf@...r.kernel.org, kernel-team@...udflare.com,
ivan@...udflare.com, edumazet@...gle.com, davem@...emloft.net,
kuba@...nel.org, pabeni@...hat.com, ast@...nel.org,
daniel@...earbox.net, andrii@...nel.org, brakmo@...com,
Marek Majkowski <marek@...udflare.com>
Subject: [PATCH net-next v2 0/2] RTAX_INITRWND should be able to bring the rcv_ssthresh above 64KiB
Among the many route options, we support the initrwnd/RTAX_INITRWND
path attribute:
$ ip route change local 127.0.0.0/8 dev lo initrwnd 1024
This sets the initial receive window size (in packets). However, it's
not very useful in practice. For smaller buffers (<128KiB) it can be
used to lower the initial receive window, but it's hard to imagine
when this is useful. The same effect can be achieved with the
TCP_WINDOW_CLAMP / RTAX_WINDOW option.
For larger buffers (>128KiB) the initial receive window is usually
limited by rcv_ssthresh, which starts at 64KiB. The initrwnd option
can't bring the window above it, which limits its usefulness.
This patch changes that. Now, by setting the RTAX_INITRWND path
attribute we bring up the initial rcv_ssthresh in line with the
initrwnd value. This allows the initial advertised receive window to
be raised above 64KiB right after the first TCP RTT.
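The before/after behaviour described above can be sketched as a
simplified model. This is not the kernel's actual code path: the
function names and the `space` parameter (available receive buffer
space in bytes) are hypothetical, and only the 64KiB starting value of
rcv_ssthresh and the packet-based initrwnd unit come from the text.

```c
#include <stdint.h>

/* Starting rcv_ssthresh as stated in the cover letter. */
#define RCV_SSTHRESH_START (64u * 1024u)

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* initrwnd is in packets; convert to bytes without overflowing 32 bits. */
static uint32_t initrwnd_bytes(uint32_t initrwnd_pkts, uint32_t mss)
{
	uint64_t bytes = (uint64_t)initrwnd_pkts * mss;

	return bytes > UINT32_MAX ? UINT32_MAX : (uint32_t)bytes;
}

/* Hypothetical model of the old behaviour: initrwnd can only lower the
 * window, because the 64KiB rcv_ssthresh cap is applied first.
 * initrwnd_pkts == 0 means the route attribute is unset. */
uint32_t initial_window_before(uint32_t space, uint32_t initrwnd_pkts,
			       uint32_t mss)
{
	uint32_t wnd = min_u32(space, RCV_SSTHRESH_START);

	if (initrwnd_pkts)
		wnd = min_u32(wnd, initrwnd_bytes(initrwnd_pkts, mss));
	return wnd;
}

/* Hypothetical model of the new behaviour: when initrwnd is set, the
 * initial rcv_ssthresh follows it, limited only by the buffer space. */
uint32_t initial_window_after(uint32_t space, uint32_t initrwnd_pkts,
			      uint32_t mss)
{
	if (initrwnd_pkts)
		return min_u32(space, initrwnd_bytes(initrwnd_pkts, mss));
	return min_u32(space, RCV_SSTHRESH_START);
}
```

In this model, with 1MiB of buffer space, initrwnd 1024 and an assumed
MSS of 1460, the old behaviour yields 65536 bytes and the new one
yields the full 1048576 bytes.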
With this change, the administrator can configure a route (or an skops
eBPF program) where the receive window opens much faster than usual.
This is useful on high-BDP connections (large latency, high
throughput), where it takes a long time to fully open the receive
window because of the usual rcv_ssthresh cap.
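To get a feel for the numbers involved, the bandwidth-delay product
can be turned into a rough initrwnd value. The helper names below are
hypothetical, and the estimate ignores truesize overhead, so treat it
as an upper-level sketch rather than a tuning recipe.

```c
#include <stdint.h>

/* Bandwidth-delay product in bytes:
 * (bits per second * RTT in milliseconds) / (8 bits * 1000 ms). */
uint64_t bdp_bytes(uint64_t bits_per_sec, uint64_t rtt_ms)
{
	return bits_per_sec * rtt_ms / 8000;
}

/* Number of MSS-sized packets needed to cover `bytes`, rounded up.
 * This is the unit that initrwnd is expressed in. */
uint64_t packets_for(uint64_t bytes, uint32_t mss)
{
	return (bytes + mss - 1) / mss;
}
```

For example, a 1 Gbit/s path with a 100 ms RTT has a BDP of 12500000
bytes, which is 8562 packets at an assumed MSS of 1460. That is far
above what a 64KiB starting window can cover.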
However, this feature should be used with caution. It only makes sense
to employ it in limited circumstances:
* When using high-bandwidth TCP transfers over high-latency links.
* When the truesize of the flow/NIC is sensible and predictable.
* When the application is ready to send a lot of data immediately
after the flow is established.
* When the sender has configured a larger than usual `initcwnd`.
* When optimizing for every possible RTT.
This patch is related to previous work by Ivan Babrou:
https://lore.kernel.org/bpf/CAA93jw5+LjKLcCaNr5wJGPrXhbjvLhts8hqpKPFx7JeWG4g0AA@mail.gmail.com/T/
Please note that, due to TCP wscale semantics, the TCP sender needs
to receive the first ACK to learn of the large receive window. That
is, the large window is advertised only in the first ACK from the
peer. When the TCP client has a large window, it is advertised in the
third packet (ACK) of the handshake. When the TCP server has a large
window, it is advertised only in the first ACK after some data has
been received.
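The reason is that the 16-bit window field in SYN and SYN-ACK segments
is never scaled (RFC 7323), so those segments cannot advertise more
than 65535 bytes. A minimal sketch of that rule follows; the function
names are hypothetical and this is not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* 16-bit window field for an outgoing segment. Segments with SYN set
 * carry the unscaled value; all others are shifted by wscale. */
uint16_t window_field(uint32_t window_bytes, uint8_t wscale, bool syn)
{
	uint32_t field = syn ? window_bytes : window_bytes >> wscale;

	return field > UINT16_MAX ? UINT16_MAX : (uint16_t)field;
}

/* Window size the peer derives from the received field. */
uint32_t window_seen_by_peer(uint16_t field, uint8_t wscale, bool syn)
{
	return syn ? field : (uint32_t)field << wscale;
}
```

With a 1MiB window and wscale 7, a SYN or SYN-ACK carries at most
65535, while the next plain ACK carries 8192, which the peer scales
back up to 1048576.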
Syncookie support will be provided in a subsequent patchset, since it
requires more changes.
Marek Majkowski (2):
RTAX_INITRWND should be able to set the rcv_ssthresh above 64KiB
Tests for RTAX_INITRWND
include/linux/tcp.h | 1 +
net/ipv4/tcp_minisocks.c | 9 +-
net/ipv4/tcp_output.c | 7 +-
.../selftests/bpf/prog_tests/tcp_initrwnd.c | 420 ++++++++++++++++++
.../selftests/bpf/progs/test_tcp_initrwnd.c | 30 ++
5 files changed, 463 insertions(+), 4 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/tcp_initrwnd.c
create mode 100644 tools/testing/selftests/bpf/progs/test_tcp_initrwnd.c
--
2.25.1