Message-ID: <a483c1dd-f593-4f6b-9afe-bfb6d43647bf@linux.dev>
Date: Mon, 10 Feb 2025 15:37:19 -0800
From: Martin KaFai Lau <martin.lau@...ux.dev>
To: Jason Xing <kerneljasonxing@...il.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, dsahern@...nel.org, willemdebruijn.kernel@...il.com,
willemb@...gle.com, ast@...nel.org, daniel@...earbox.net, andrii@...nel.org,
eddyz87@...il.com, song@...nel.org, yonghong.song@...ux.dev,
john.fastabend@...il.com, kpsingh@...nel.org, sdf@...ichev.me,
haoluo@...gle.com, jolsa@...nel.org, horms@...nel.org, bpf@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: [PATCH bpf-next v9 00/12] net-timestamp: bpf extension to equip
applications transparently
On 2/8/25 2:32 AM, Jason Xing wrote:
> "Timestamping is key to debugging network stack latency. With
> SO_TIMESTAMPING, bugs that are otherwise incorrectly assumed to be
> network issues can be attributed to the kernel." This is extracted
> from the talk "SO_TIMESTAMPING: Powering Fleetwide RPC Monitoring"
> given by Willem de Bruijn at netdevconf 0x17.
>
> There are a few areas that need optimization with the consideration of
> easier use and less performance impact, which I highlighted and mainly
> discussed at netconf 2024 with Willem de Bruijn and John Fastabend:
> uAPI compatibility, extra system call overhead, and the need for
> application modification. I initially managed to solve these issues
> by writing a kernel module that hooks various key functions. However,
> this approach is not suitable for the next kernel release. Therefore,
> a BPF extension was proposed. Recently, Martin KaFai Lau has
> provided invaluable suggestions about BPF along the way. Many thanks
> here!
>
> In this series, only support foundamental codes and tx for TCP.
typo: fundamental.... This had been brought up before (in v7?).
By fundamental, I suspect you meant (?) bpf timestamping infrastructure, like:
"This series adds the BPF networking timestamping infrastructure. This series
also adds TX timestamping support for TCP. The RX timestamping and UDP support
will be added in the future."
> This approach mostly relies on the existing SO_TIMESTAMPING feature; users
It reuses most of the tx timestamping callbacks that are currently enabled by
SO_TIMESTAMPING. However, I don't think there is much overlap in terms of the
SO_TIMESTAMPING API, although this comment reads like API reuse on first reading.
> only need to pass certain flags through bpf_setsockopt() to a separate
> tsflags. Please see the last selftest patch in this series.
>
> ---
> v8
> Link: https://lore.kernel.org/all/20250128084620.57547-1-kerneljasonxing@gmail.com/
> 1. adjust some commit messages and titles
> 2. add sk cookie in selftests
> 3. handle the NULL pointer in hwstamp
> 4. use kfunc to do selective sampling
>
> v7
> Link: https://lore.kernel.org/all/20250121012901.87763-1-kerneljasonxing@gmail.com/
> 1. target bpf-next tree
> 2. simply and directly stop timestamping callbacks from calling a few BPF
> CALLS due to safety concerns.
> 3. add more new testcases and adjust the existing testcases
> 4. revise some comments of new timestamping callbacks
> 5. remove a few BPF CGROUP locks
>
> RFC v6
> In the meantime, any suggestions and reviews are welcome!
> Link: https://lore.kernel.org/all/20250112113748.73504-1-kerneljasonxing@gmail.com/
> 1. handle those safety problems by using the correct method.
> 2. support bpf_getsockopt.
> 3. adjust the position of BPF_SOCK_OPS_TS_TCP_SND_CB
> 4. fix mishandling the hardware timestamp error
> 5. add more corresponding tests
>
> v5
> Link: https://lore.kernel.org/all/20241207173803.90744-1-kerneljasonxing@gmail.com/
> 1. handle the safety issues when someone tries to call unrelated bpf
> helpers.
> 2. avoid adding direct function call in the hot path like
> __dev_queue_xmit()
> 3. remove reporting the hardware timestamp and tskey since they can be
> fetched through the existing helper with the help of
> bpf_skops_init_skb(), please see the selftest.
> 4. add new sendmsg callback in tcp_sendmsg, and introduce tskey_bpf used
> by bpf program to correlate tcp_sendmsg with other hook points in patch [13/15].
>
> v4
> Link: https://lore.kernel.org/all/20241028110535.82999-1-kerneljasonxing@gmail.com/
> 1. introduce sk->sk_bpf_cb_flags to let user use bpf_setsockopt() (Martin)
> 2. introduce SKBTX_BPF to enable the bpf SO_TIMESTAMPING feature (Martin)
> 3. introduce bpf map in tests (Martin)
> 4. I choose to make this series as simple as possible, so I only support
> most cases in the tx path for TCP protocol.
>
> v3
> Link: https://lore.kernel.org/all/20241012040651.95616-1-kerneljasonxing@gmail.com/
> 1. support UDP proto by introducing a new generation point.
> 2. for OPT_ID, introduce sk_tskey_bpf_offset to compute the delta
> between the current socket key and bpf socket key. It is designed for
> UDP, and also applies to TCP.
> 3. support bpf_getsockopt()
> 4. use cgroup static key instead.
> 5. add one simple bpf selftest to show how it can be used.
> 6. remove the rx support from v2 because the number of patches could
> exceed the limit of one series.
>
> v2
> Link: https://lore.kernel.org/all/20241008095109.99918-1-kerneljasonxing@gmail.com/
> 1. Introduce tsflag requestors so that we are able to extend more in the
> future. Besides, it enables TX flags for bpf extension feature separately
> without breaking users. It is suggested by Vadim Fedorenko.
> 2. introduce a static key to control the whole feature. (Willem)
> 3. Open the gate of bpf_setsockopt for the SO_TIMESTAMPING feature in
> some TX/RX cases, not all the cases.
>
> Jason Xing (12):
> bpf: add support for bpf_setsockopt()
> bpf: prepare for timestamping callbacks use
> bpf: stop unsafely accessing TCP fields in bpf callbacks
> bpf: stop calling some sock_op BPF CALLs in new timestamping callbacks
> net-timestamp: prepare for isolating two modes of SO_TIMESTAMPING
> bpf: support SCM_TSTAMP_SCHED of SO_TIMESTAMPING
> bpf: support sw SCM_TSTAMP_SND of SO_TIMESTAMPING
> bpf: support hw SCM_TSTAMP_SND of SO_TIMESTAMPING
> bpf: support SCM_TSTAMP_ACK of SO_TIMESTAMPING
> bpf: add a new callback in tcp_tx_timestamp()
> bpf: support selective sampling for bpf timestamping
> selftests/bpf: add simple bpf tests in the tx path for timestamping
> feature
>
> include/linux/filter.h | 1 +
> include/linux/skbuff.h | 12 +-
> include/net/sock.h | 10 +
> include/net/tcp.h | 5 +-
> include/uapi/linux/bpf.h | 30 ++
> kernel/bpf/btf.c | 1 +
> net/core/dev.c | 3 +-
> net/core/filter.c | 75 ++++-
> net/core/skbuff.c | 48 ++-
> net/core/sock.c | 15 +
> net/dsa/user.c | 2 +-
> net/ipv4/tcp.c | 4 +
> net/ipv4/tcp_input.c | 2 +
> net/ipv4/tcp_output.c | 2 +
> net/socket.c | 2 +-
> tools/include/uapi/linux/bpf.h | 23 ++
> .../bpf/prog_tests/so_timestamping.c | 79 +++++
> .../selftests/bpf/progs/so_timestamping.c | 312 ++++++++++++++++++
> 18 files changed, 612 insertions(+), 14 deletions(-)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/so_timestamping.c
> create mode 100644 tools/testing/selftests/bpf/progs/so_timestamping.c
>