Message-Id: <20200609140934.110785-1-willemdebruijn.kernel@gmail.com>
Date: Tue, 9 Jun 2020 10:09:28 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: netdev@...r.kernel.org
Cc: Willem de Bruijn <willemb@...gle.com>
Subject: [PATCH RFC net-next 0/6] multi release pacing for UDP GSO
From: Willem de Bruijn <willemb@...gle.com>
UDP segmentation offload with UDP_SEGMENT can significantly reduce the
transmission cycle cost per byte for protocols like QUIC.
Pacing offload with SO_TXTIME can improve accuracy and cycle cost of
pacing for such userspace protocols further.
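
For readers unfamiliar with the two interfaces, here is a minimal userspace
sketch of how a single sendmsg() call can carry both a UDP_SEGMENT size and
an SCM_TXTIME release time as control messages. The helper name
build_gso_txtime_cmsgs() is hypothetical, and the fallback numeric values
are the generic Linux ones, provided only in case older libc headers lack
the definitions. The sketch assumes the socket has already enabled SO_TXTIME
with setsockopt(); it only fills in the control buffer and does not send.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/udp.h>

/* Fallbacks for older headers (assumed generic Linux values). */
#ifndef SOL_UDP
#define SOL_UDP 17
#endif
#ifndef UDP_SEGMENT
#define UDP_SEGMENT 103
#endif
#ifndef SO_TXTIME
#define SO_TXTIME 61
#endif
#ifndef SCM_TXTIME
#define SCM_TXTIME SO_TXTIME
#endif

/*
 * Hypothetical helper: attach a UDP_SEGMENT cmsg (segment size in bytes)
 * and an SCM_TXTIME cmsg (release time in nsec) to msg, using ctrl as
 * backing storage. Returns 0 on success, -1 if ctrl is too small.
 */
static int build_gso_txtime_cmsgs(struct msghdr *msg, void *ctrl,
                                  size_t ctrl_len, uint16_t gso_size,
                                  uint64_t txtime_ns)
{
    size_t need = CMSG_SPACE(sizeof(gso_size)) +
                  CMSG_SPACE(sizeof(txtime_ns));
    struct cmsghdr *cm;

    if (ctrl_len < need)
        return -1;

    /* Zero the buffer so that CMSG_NXTHDR sees no stale lengths. */
    memset(ctrl, 0, need);
    msg->msg_control = ctrl;
    msg->msg_controllen = need;

    cm = CMSG_FIRSTHDR(msg);
    assert(cm != NULL);
    cm->cmsg_level = SOL_UDP;
    cm->cmsg_type = UDP_SEGMENT;
    cm->cmsg_len = CMSG_LEN(sizeof(gso_size));
    memcpy(CMSG_DATA(cm), &gso_size, sizeof(gso_size));

    cm = CMSG_NXTHDR(msg, cm);
    assert(cm != NULL);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_TXTIME;
    cm->cmsg_len = CMSG_LEN(sizeof(txtime_ns));
    memcpy(CMSG_DATA(cm), &txtime_ns, sizeof(txtime_ns));

    return 0;
}
```

With this interface as it stands, the one release time applies to the whole
GSO datagram, which is the limitation discussed next.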
But the maximum GSO size built is limited by the pacing rate. At a msec
pacing interval, for many Internet clients this results in at most a few
segments per datagram.
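
To make the arithmetic concrete, here is a small sketch that computes how
many whole segments fit in one pacing interval. The function name and the
example rates are illustrative assumptions, not taken from the patches.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of whole segments of seg_size bytes that a
 * flow paced at rate_bytes_per_sec may send per interval_usec.
 */
static uint32_t segs_per_interval(uint64_t rate_bytes_per_sec,
                                  uint32_t interval_usec,
                                  uint32_t seg_size)
{
    uint64_t bytes;

    assert(seg_size > 0);
    bytes = rate_bytes_per_sec * interval_usec / 1000000ULL;
    return (uint32_t)(bytes / seg_size);
}
```

For example, a 25 Mbit/s flow (3125000 bytes/s) paced at 1 msec with
1400 byte segments gets segs_per_interval(3125000, 1000, 1400) == 2, so
each GSO datagram can hold only two segments.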
The pros and cons were captured in a recent Cloudflare article,
specifically mentioning
"But it does not yet support specifying different times for each
packet when GSO is used, as there is no way to define multiple
timestamps for packets that need to be segmented (each segmented
packet essentially ends up being sent at the same time anyway)."
https://blog.cloudflare.com/accelerating-udp-packet-transmission-for-quic/
We have been evaluating such a mechanism for multiple release times
per UDP GSO packet. Since it sounds like it may be of interest to
others, too, since it may be a while before we have all the data I'd
like, and since it's quieter on the list now that the merge window is
open, I am sharing a WIP version.
The basic approach is to specify
1. initial early release time (in nsec)
2. interval between subsequent release times (in msec)
3. number of segments to release at each release time
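
As an illustration of how I read these three parameters, the release time
of each segment in a GSO datagram could be derived as below. This is a
standalone sketch with hypothetical names, not the code from patch 1.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: release time (nsec) of the segment at zero-based
 * index seg_idx, when segs_per_release segments leave at first_ns and
 * each later group leaves interval_ms after the previous one.
 */
static uint64_t segment_release_ns(uint64_t first_ns, uint32_t interval_ms,
                                   uint32_t segs_per_release,
                                   uint32_t seg_idx)
{
    uint64_t group;

    assert(segs_per_release > 0);
    group = seg_idx / segs_per_release;
    return first_ns + group * interval_ms * 1000000ULL;
}
```

With first_ns = 1000, a 2 msec interval and 3 segments per release,
segments 0..2 leave at 1000 nsec, segments 3..5 at 2001000 nsec, and
segments 6..8 at 4001000 nsec.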
One implementation concern is where to store the additional two fields
in the skb. Given that msec granularity suffices for Internet pacing,
for now repurpose the two lowest 4-bit nibbles in skb->tstamp to hold
the interval and segment count. I'm aware that this does not win a prize
for elegance.
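
A rough sketch of such an encoding on a plain 64-bit nsec timestamp follows.
Which nibble holds which field is my assumption here (interval in bits 4..7,
segment count in bits 0..3), and the mr_* names are hypothetical. The low
8 bits of the release time are sacrificed, which costs at most 255 nsec of
precision.

```c
#include <assert.h>
#include <stdint.h>

#define MR_META_MASK   0xFFULL  /* two lowest nibbles */
#define MR_NIBBLE_MASK 0xFU
#define MR_INTERVAL_SHIFT 4

/* Pack interval (msec) and segment count into the low byte of txtime_ns. */
static uint64_t mr_pack(uint64_t txtime_ns, unsigned int interval_ms,
                        unsigned int segs)
{
    assert(interval_ms <= MR_NIBBLE_MASK && segs <= MR_NIBBLE_MASK);
    return (txtime_ns & ~MR_META_MASK) |
           ((uint64_t)interval_ms << MR_INTERVAL_SHIFT) | segs;
}

/* Release time with the metadata bits cleared. */
static uint64_t mr_time_ns(uint64_t v)
{
    return v & ~MR_META_MASK;
}

/* Interval between release times, in msec. */
static unsigned int mr_interval_ms(uint64_t v)
{
    return (unsigned int)(v >> MR_INTERVAL_SHIFT) & MR_NIBBLE_MASK;
}

/* Number of segments per release time. */
static unsigned int mr_segs(uint64_t v)
{
    return (unsigned int)v & MR_NIBBLE_MASK;
}
```

Each field is limited to the range 0..15 by its 4-bit width, which is part
of why this storage is listed as a known limitation.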
Patch 1 adds the socket option and basic segmentation function to
adjust the skb->tstamp of the individual segments.
Patch 2 extends this with support for building GSO segs. Build one GSO
segment per interval if the hardware can offload (USO) and thus
we are segmenting only to maintain pacing rate.
Patch 3 wires the segmentation up to the FQ qdisc on enqueue, so that
segments will be scheduled for delivery at their adjusted time.
Patches 4..6 extend existing tests to experiment with the feature:
  Patch 4 allows testing so_txtime across hardware (for USO)
  Patch 5 extends the so_txtime test with support for gso and mr-pacing
  Patch 6 extends the udpgso bench to support pacing and mr-pacing
Some known limitations:
- the aforementioned storage in skb->tstamp.
- exposing this constraint through the SO_TXTIME interface.
  it is cleaner to add new fields to the cmsg, at nsec resolution.
- the fq_enqueue path adds a branch to the hot path.
  a static branch would avoid that.
- a few udp specific assumptions in a net/core datapath.
  notably the hw_features. this can be derived from gso_type.
Willem de Bruijn (6):
  net: multiple release time SO_TXTIME
  net: build gso segs in multi release time SO_TXTIME
  net_sched: sch_fq: multiple release time support
  selftests/net: so_txtime: support txonly/rxonly modes
  selftests/net: so_txtime: add gso and multi release pacing
  selftests/net: upgso bench: add pacing with SO_TXTIME
 include/linux/netdevice.h                     |   1 +
 include/net/sock.h                            |   3 +-
 include/uapi/linux/net_tstamp.h               |   3 +-
 net/core/dev.c                                |  71 +++++++++
 net/core/sock.c                               |   4 +
 net/sched/sch_fq.c                            |  33 ++++-
 tools/testing/selftests/net/so_txtime.c       | 136 ++++++++++++++----
 tools/testing/selftests/net/so_txtime.sh      |   7 +
 .../testing/selftests/net/so_txtime_multi.sh  |  68 +++++++++
 .../selftests/net/udpgso_bench_multi.sh       |  65 +++++++++
 tools/testing/selftests/net/udpgso_bench_tx.c |  72 +++++++++-
 11 files changed, 431 insertions(+), 32 deletions(-)
 create mode 100755 tools/testing/selftests/net/so_txtime_multi.sh
 create mode 100755 tools/testing/selftests/net/udpgso_bench_multi.sh
--
2.27.0.278.ge193c7cf3a9-goog