Message-ID: <CAL+tcoCmXcDot-855XYU7PKCiGvJL=O3CQBGuOTRAs2_=Ys=gg@mail.gmail.com>
Date: Wed, 5 Feb 2025 13:28:56 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, dsahern@...nel.org, willemdebruijn.kernel@...il.com,
willemb@...gle.com, ast@...nel.org, daniel@...earbox.net, andrii@...nel.org,
martin.lau@...ux.dev, eddyz87@...il.com, song@...nel.org,
yonghong.song@...ux.dev, john.fastabend@...il.com, kpsingh@...nel.org,
sdf@...ichev.me, haoluo@...gle.com, jolsa@...nel.org, horms@...nel.org
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH bpf-next v8 11/12] bpf: add a new callback in tcp_tx_timestamp()
On Wed, Feb 5, 2025 at 2:31 AM Jason Xing <kerneljasonxing@...il.com> wrote:
>
> Introduce the callback to correlate tcp_sendmsg timestamp with other
> points, like SND/SW/ACK. let bpf prog trace the beginning of
> tcp_sendmsg_locked() and then store the sendmsg timestamp at
> the bpf_sk_storage, so that in tcp_tx_timestamp() we can correlate
> the timestamp with tskey which can be found in other sending points.
>
> More details can be found in the selftest:
> The selftest uses the bpf_sk_storage to store the sendmsg timestamp at
> fentry/tcp_sendmsg_locked and retrieves it back at tcp_tx_timestamp
> (i.e. BPF_SOCK_OPS_TS_SND_CB added in this patch).
>
> Signed-off-by: Jason Xing <kerneljasonxing@...il.com>
> ---
> include/uapi/linux/bpf.h | 7 +++++++
> net/ipv4/tcp.c | 1 +
> tools/include/uapi/linux/bpf.h | 7 +++++++
> 3 files changed, 15 insertions(+)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 800122a8abe5..accb3b314fff 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -7052,6 +7052,13 @@ enum {
> * when SK_BPF_CB_TX_TIMESTAMPING
> * feature is on.
> */
> + BPF_SOCK_OPS_TS_SND_CB, /* Called when every sendmsg syscall
> + * is triggered. For TCP, it stays
> + * in the last send process to
> + * correlate with tcp_sendmsg timestamp
> + * with other timestamping callbacks,
> + * like SND/SW/ACK.
> + */
> };
In case the use of the new flag gets buried across the many threads, let
me restate here how UDP would use it:
1. Introduce a field ts_opt_id_bpf, which works like ts_opt_id[1], so that
the bpf program can take full control of managing the tskey.
2. Hook udp_sendmsg() with fentry, and add a callback in the kernel,
similar to BPF_SOCK_OPS_TIMEOUT_INIT, that initializes ts_opt_id_bpf
with the tskey the bpf prog generates. We can reuse
BPF_SOCK_OPS_TS_SND_CB for this directly.
3. Modify the SCM_TS_OPT_ID logic to support the bpf extension so that the
newly added field ts_opt_id_bpf can be passed to
skb_shinfo(skb)->tskey in __ip_append_data().
This approach can also be extended to other protocols.
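To make the three steps concrete, here is a rough userspace sketch of the
intended flow. It is not kernel code: every sketch_* name is a hypothetical
stand-in, the valid flag and the fallback counter are assumptions of mine,
and only ts_opt_id_bpf, tskey and the callback name come from the steps
above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the socket state. */
struct sketch_sock {
	bool     bpf_tx_timestamping; /* stands in for SK_BPF_CB_TX_TIMESTAMPING */
	bool     ts_opt_id_bpf_valid; /* assumption: marks a pending bpf-provided key */
	uint32_t ts_opt_id_bpf;       /* step 1: the proposed new field */
	uint32_t next_tskey;          /* assumption: fallback per-socket counter */
};

/* Stand-in for skb_shinfo(skb)->tskey. */
struct sketch_skb {
	uint32_t tskey;
};

/* Stand-in for the bpf prog that owns tskey generation. */
struct sketch_bpf_prog {
	uint32_t next_key;
};

/* Step 2: what a BPF_SOCK_OPS_TS_SND_CB-style callback would do on
 * entry to sendmsg: let the bpf prog pick the key and store it.
 */
static void sketch_ts_snd_cb(struct sketch_sock *sk,
			     struct sketch_bpf_prog *prog)
{
	if (!sk->bpf_tx_timestamping)
		return;
	sk->ts_opt_id_bpf = prog->next_key++;
	sk->ts_opt_id_bpf_valid = true;
}

/* Step 3: the append path prefers the bpf-provided key, in the same
 * spirit as SCM_TS_OPT_ID overriding the counter.
 */
static void sketch_append_data(struct sketch_sock *sk,
			       struct sketch_skb *skb)
{
	if (sk->ts_opt_id_bpf_valid) {
		skb->tskey = sk->ts_opt_id_bpf;
		sk->ts_opt_id_bpf_valid = false; /* consume the key once */
	} else {
		skb->tskey = sk->next_tskey++;
	}
}

/* One simulated sendmsg call; returns the tskey the skb ended up with. */
static uint32_t sketch_udp_sendmsg(struct sketch_sock *sk,
				   struct sketch_bpf_prog *prog)
{
	struct sketch_skb skb = { 0 };

	sketch_ts_snd_cb(sk, prog);
	sketch_append_data(sk, &skb);
	return skb.tskey;
}
```

With bpf_tx_timestamping set, each simulated send carries the key the
"prog" generated. With it clear, the sketch falls back to the per-socket
counter, so sockets without the bpf feature are unaffected.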
[1]
commit 4aecca4c76808f3736056d18ff510df80424bc9f
Author: Vadim Fedorenko <vadim.fedorenko@...ux.dev>
Date: Tue Oct 1 05:57:14 2024 -0700
net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message
SOF_TIMESTAMPING_OPT_ID socket option flag gives a way to correlate TX
timestamps and packets sent via socket. Unfortunately, there is no way
to reliably predict socket timestamp ID value in case of error returned
by sendmsg. For UDP sockets it's impossible because of lockless
nature of UDP transmit, several threads may send packets in parallel. In
case of RAW sockets MSG_MORE option makes things complicated. More
details are in the conversation [1].
This patch adds new control message type to give user-space
software an opportunity to control the mapping between packets and
values by providing ID with each sendmsg for UDP sockets.
The documentation is also added in this patch.
[1] https://lore.kernel.org/netdev/CALCETrU0jB+kg0mhV6A8mrHfTE1D1pr1SD_B9Eaa9aDPfgHdtA@mail.gmail.com/
Thanks,
Jason
>
> /* List of TCP states. There is a build check in net/ipv4/tcp.c to detect
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 3df802410ebf..a2ac57543b6d 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -501,6 +501,7 @@ static void tcp_tx_timestamp(struct sock *sk, struct sockcm_cookie *sockc)
> tcb->txstamp_ack_bpf = 1;
> shinfo->tx_flags |= SKBTX_BPF;
> shinfo->tskey = TCP_SKB_CB(skb)->seq + skb->len - 1;
> + bpf_skops_tx_timestamping(sk, skb, BPF_SOCK_OPS_TS_SND_CB);
> }
> }
>
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 06e68d772989..384502996cdd 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -7045,6 +7045,13 @@ enum {
> * when SK_BPF_CB_TX_TIMESTAMPING
> * feature is on.
> */
> + BPF_SOCK_OPS_TS_SND_CB, /* Called when every sendmsg syscall
> + * is triggered. For TCP, it stays
> + * in the last send process to
> + * correlate with tcp_sendmsg timestamp
> + * with other timestamping callbacks,
> + * like SND/SW/ACK.
> + */
> };
>
> /* List of TCP states. There is a build check in net/ipv4/tcp.c to detect
> --
> 2.43.5
>