Message-Id: <20250212061855.71154-1-kerneljasonxing@gmail.com>
Date: Wed, 12 Feb 2025 14:18:43 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: davem@...emloft.net,
edumazet@...gle.com,
kuba@...nel.org,
pabeni@...hat.com,
dsahern@...nel.org,
willemdebruijn.kernel@...il.com,
willemb@...gle.com,
ast@...nel.org,
daniel@...earbox.net,
andrii@...nel.org,
martin.lau@...ux.dev,
eddyz87@...il.com,
song@...nel.org,
yonghong.song@...ux.dev,
john.fastabend@...il.com,
kpsingh@...nel.org,
sdf@...ichev.me,
haoluo@...gle.com,
jolsa@...nel.org,
shuah@...nel.org,
ykolal@...com
Cc: bpf@...r.kernel.org,
netdev@...r.kernel.org,
Jason Xing <kerneljasonxing@...il.com>
Subject: [PATCH bpf-next v10 00/12] net-timestamp: bpf extension to equip applications transparently
"Timestamping is key to debugging network stack latency. With
SO_TIMESTAMPING, bugs that are otherwise incorrectly assumed to be
network issues can be attributed to the kernel." This is extracted
from the talk "SO_TIMESTAMPING: Powering Fleetwide RPC Monitoring"
given by Willem de Bruijn at netdevconf 0x17.
A few areas need optimization for easier use and lower performance
impact, which I highlighted and discussed at netconf 2024 with
Willem de Bruijn and John Fastabend:
uAPI compatibility, extra system call overhead, and the need for
application modification. I initially managed to solve these issues
by writing a kernel module that hooks various key functions. However,
this approach is not suitable for the next kernel release. Therefore,
a BPF extension was proposed. Recently, Martin KaFai Lau has
provided invaluable suggestions about BPF along the way. Many thanks
here!
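For context, the application-side opt-in that the existing SO_TIMESTAMPING
interface requires can be sketched as follows. This is a minimal
illustration, not code from this series: the helper names
tx_timestamping_flags() and enable_tx_timestamping() are hypothetical and
the flag selection is only one plausible choice.

```c
#include <errno.h>
#include <sys/socket.h>
#include <linux/net_tstamp.h>

/* Hypothetical helper: one plausible set of TX timestamping flags. */
unsigned int tx_timestamping_flags(void)
{
	return SOF_TIMESTAMPING_TX_SCHED |    /* stamp when entering the packet scheduler */
	       SOF_TIMESTAMPING_TX_SOFTWARE | /* stamp when handed to the driver */
	       SOF_TIMESTAMPING_TX_ACK |      /* stamp when the data is acked (TCP) */
	       SOF_TIMESTAMPING_SOFTWARE |    /* report software timestamps */
	       SOF_TIMESTAMPING_OPT_TSONLY;   /* return the timestamp without the packet */
}

/* Hypothetical helper: returns 0 on success or a negative errno value. */
int enable_tx_timestamping(int fd)
{
	unsigned int flags = tx_timestamping_flags();

	if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &flags, sizeof(flags)) < 0)
		return -errno;
	return 0;
}
```

Every application has to add a call like this and then read the
timestamps back from the socket error queue with recvmsg(MSG_ERRQUEUE),
which is where the application modification and the extra system calls
mentioned above come from.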
This series adds the BPF networking timestamping infrastructure by
reusing most of the TX timestamping callbacks that are currently enabled
by SO_TIMESTAMPING. This series also adds TX timestamping support
for TCP. RX timestamping and UDP support will be added in the future.
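As a rough illustration of how per-stage TX timestamps could be turned
into latencies, the sketch below models the arithmetic in plain C. It is
an assumption-laden model and not code from this series: the stage enum,
struct tx_sample and stage_delay_ns() are hypothetical names that only
loosely mirror the sendmsg, SCHED, SW and ACK callbacks in the patch list.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stage labels; these are not kernel identifiers. */
enum tx_stage { STAGE_SND, STAGE_SCHED, STAGE_SW, STAGE_ACK, STAGE_MAX };

/* One sampled send: a timestamp per stage, 0 meaning "not observed". */
struct tx_sample {
	uint64_t ts_ns[STAGE_MAX];
};

/*
 * Delay between two stages in nanoseconds, or -1 when either stage was
 * not observed or the timestamps are out of order.
 */
int64_t stage_delay_ns(const struct tx_sample *s,
		       enum tx_stage from, enum tx_stage to)
{
	assert(from < STAGE_MAX && to < STAGE_MAX);

	if (!s->ts_ns[from] || !s->ts_ns[to] || s->ts_ns[to] < s->ts_ns[from])
		return -1;
	return (int64_t)(s->ts_ns[to] - s->ts_ns[from]);
}
```

With timestamps of 1000 ns at sendmsg and 1500 ns at the scheduler stage,
stage_delay_ns(&s, STAGE_SND, STAGE_SCHED) yields 500.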
---
v9
Link: https://lore.kernel.org/all/20250208103220.72294-1-kerneljasonxing@gmail.com/
1. set the hwtstamp to skb when the skb enters into the hw SND case
2. fix co-existence problem in patch 9 and add corresponding check in
patch 12.
3. refine some commit messages and titles
v8
Link: https://lore.kernel.org/all/20250128084620.57547-1-kerneljasonxing@gmail.com/
1. adjust some commit messages and titles
2. add sk cookie in selftests
3. handle the NULL pointer in hwstamp
4. use kfunc to do selective sampling
v7
Link: https://lore.kernel.org/all/20250121012901.87763-1-kerneljasonxing@gmail.com/
1. target bpf-next tree
2. simply and directly stop timestamping callbacks from calling a few BPF
CALLS due to safety concerns.
3. add more new testcases and adjust the existing testcases
4. revise some comments of new timestamping callbacks
5. remove a few BPF CGROUP locks
RFC v6
In the meantime, any suggestions and reviews are welcome!
Link: https://lore.kernel.org/all/20250112113748.73504-1-kerneljasonxing@gmail.com/
1. handle those safety problems by using the correct method.
2. support bpf_getsockopt.
3. adjust the position of BPF_SOCK_OPS_TS_TCP_SND_CB
4. fix mishandling of the hardware timestamp error
5. add more corresponding tests
v5
Link: https://lore.kernel.org/all/20241207173803.90744-1-kerneljasonxing@gmail.com/
1. handle the safety issues when someone tries to call unrelated bpf
helpers.
2. avoid adding direct function call in the hot path like
__dev_queue_xmit()
3. remove reporting the hardware timestamp and tskey since they can be
fetched through the existing helper with the help of
bpf_skops_init_skb(), please see the selftest.
4. add new sendmsg callback in tcp_sendmsg, and introduce tskey_bpf used
by bpf program to correlate tcp_sendmsg with other hook points in patch [13/15].
v4
Link: https://lore.kernel.org/all/20241028110535.82999-1-kerneljasonxing@gmail.com/
1. introduce sk->sk_bpf_cb_flags to let user use bpf_setsockopt() (Martin)
2. introduce SKBTX_BPF to enable the bpf SO_TIMESTAMPING feature (Martin)
3. introduce bpf map in tests (Martin)
4. I choose to make this series as simple as possible, so I only support
most cases in the tx path for TCP protocol.
v3
Link: https://lore.kernel.org/all/20241012040651.95616-1-kerneljasonxing@gmail.com/
1. support UDP proto by introducing a new generation point.
2. for OPT_ID, introduce sk_tskey_bpf_offset to compute the delta
between the current socket key and bpf socket key. It is designed for
UDP, and also applies to TCP.
3. support bpf_getsockopt()
4. use cgroup static key instead.
5. add one simple bpf selftest to show how it can be used.
6. remove the rx support from v2 because the number of patches could
exceed the limit of one series.
v2
Link: https://lore.kernel.org/all/20241008095109.99918-1-kerneljasonxing@gmail.com/
1. Introduce tsflag requestors so that we are able to extend more in the
future. Besides, it enables TX flags for bpf extension feature separately
without breaking users. It is suggested by Vadim Fedorenko.
2. introduce a static key to control the whole feature. (Willem)
3. Open the gate of bpf_setsockopt for the SO_TIMESTAMPING feature in
some TX/RX cases, not all the cases.
Jason Xing (12):
  bpf: add networking timestamping support to bpf_get/setsockopt()
  bpf: prepare the sock_ops ctx and call bpf prog for TX timestamping
  bpf: prevent unsafe access to the sock fields in the BPF timestamping
    callback
  bpf: disable unsafe helpers in TX timestamping callbacks
  net-timestamp: prepare for isolating two modes of SO_TIMESTAMPING
  bpf: add BPF_SOCK_OPS_TS_SCHED_OPT_CB callback
  bpf: add BPF_SOCK_OPS_TS_SW_OPT_CB callback
  bpf: add BPF_SOCK_OPS_TS_HW_OPT_CB callback
  bpf: add BPF_SOCK_OPS_TS_ACK_OPT_CB callback
  bpf: add BPF_SOCK_OPS_TS_SND_CB callback
  bpf: support selective sampling for bpf timestamping
  selftests/bpf: add simple bpf tests in the tx path for timestamping
    feature
 include/linux/filter.h                        |   1 +
 include/linux/skbuff.h                        |  12 +-
 include/net/sock.h                            |  10 +
 include/net/tcp.h                             |   7 +-
 include/uapi/linux/bpf.h                      |  30 +++
 kernel/bpf/btf.c                              |   1 +
 net/core/dev.c                                |   3 +-
 net/core/filter.c                             |  80 +++++-
 net/core/skbuff.c                             |  50 ++++
 net/core/sock.c                               |  14 +
 net/dsa/user.c                                |   2 +-
 net/ipv4/tcp.c                                |   6 +-
 net/ipv4/tcp_input.c                          |   2 +
 net/ipv4/tcp_output.c                         |   2 +
 net/socket.c                                  |   2 +-
 tools/include/uapi/linux/bpf.h                |  23 ++
 .../bpf/prog_tests/net_timestamping.c         | 231 +++++++++++++++++
 .../selftests/bpf/progs/net_timestamping.c    | 244 ++++++++++++++++++
 18 files changed, 706 insertions(+), 14 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/net_timestamping.c
create mode 100644 tools/testing/selftests/bpf/progs/net_timestamping.c
--
2.43.5