Message-ID: <CAL+tcoCmXcDot-855XYU7PKCiGvJL=O3CQBGuOTRAs2_=Ys=gg@mail.gmail.com>
Date: Wed, 5 Feb 2025 13:28:56 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org, 
	pabeni@...hat.com, dsahern@...nel.org, willemdebruijn.kernel@...il.com, 
	willemb@...gle.com, ast@...nel.org, daniel@...earbox.net, andrii@...nel.org, 
	martin.lau@...ux.dev, eddyz87@...il.com, song@...nel.org, 
	yonghong.song@...ux.dev, john.fastabend@...il.com, kpsingh@...nel.org, 
	sdf@...ichev.me, haoluo@...gle.com, jolsa@...nel.org, horms@...nel.org
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH bpf-next v8 11/12] bpf: add a new callback in tcp_tx_timestamp()

On Wed, Feb 5, 2025 at 2:31 AM Jason Xing <kerneljasonxing@...il.com> wrote:
>
> Introduce the callback to correlate tcp_sendmsg timestamp with other
> points, like SND/SW/ACK. let bpf prog trace the beginning of
> tcp_sendmsg_locked() and then store the sendmsg timestamp at
> the bpf_sk_storage, so that in tcp_tx_timestamp() we can correlate
> the timestamp with tskey which can be found in other sending points.
>
> More details can be found in the selftest:
> The selftest uses the bpf_sk_storage to store the sendmsg timestamp at
> fentry/tcp_sendmsg_locked and retrieves it back at tcp_tx_timestamp
> (i.e. BPF_SOCK_OPS_TS_SND_CB added in this patch).
>
> Signed-off-by: Jason Xing <kerneljasonxing@...il.com>
> ---
>  include/uapi/linux/bpf.h       | 7 +++++++
>  net/ipv4/tcp.c                 | 1 +
>  tools/include/uapi/linux/bpf.h | 7 +++++++
>  3 files changed, 15 insertions(+)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 800122a8abe5..accb3b314fff 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -7052,6 +7052,13 @@ enum {
>                                          * when SK_BPF_CB_TX_TIMESTAMPING
>                                          * feature is on.
>                                          */
> +       BPF_SOCK_OPS_TS_SND_CB,         /* Called when every sendmsg syscall
> +                                        * is triggered. For TCP, it stays
> +                                        * in the last send process to
> +                                        * correlate with tcp_sendmsg timestamp
> +                                        * with other timestamping callbacks,
> +                                        * like SND/SW/ACK.
> +                                        */
>  };

In case the intended use of the new callback gets buried across the many
threads, let me restate here how UDP would use it:
1. Introduce a field, ts_opt_id_bpf, that works like ts_opt_id[1] and lets
the bpf program take full control of tskey management.
2. Hook udp_sendmsg() with fentry, and add an in-kernel callback, similar
to BPF_SOCK_OPS_TIMEOUT_INIT, that initializes ts_opt_id_bpf with the
tskey generated by the bpf prog. BPF_SOCK_OPS_TS_SND_CB can be used
directly for this.
3. Extend the SCM_TS_OPT_ID logic to support the bpf case, so that the
newly added ts_opt_id_bpf field can be passed to
skb_shinfo(skb)->tskey in __ip_append_data().

This approach can then be extended to other protocols as well.

[1]
commit 4aecca4c76808f3736056d18ff510df80424bc9f
Author: Vadim Fedorenko <vadim.fedorenko@...ux.dev>
Date:   Tue Oct 1 05:57:14 2024 -0700

    net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message

    SOF_TIMESTAMPING_OPT_ID socket option flag gives a way to correlate TX
    timestamps and packets sent via socket. Unfortunately, there is no way
    to reliably predict socket timestamp ID value in case of error returned
    by sendmsg. For UDP sockets it's impossible because of lockless
    nature of UDP transmit, several threads may send packets in parallel. In
    case of RAW sockets MSG_MORE option makes things complicated. More
    details are in the conversation [1].
    This patch adds new control message type to give user-space
    software an opportunity to control the mapping between packets and
    values by providing ID with each sendmsg for UDP sockets.
    The documentation is also added in this patch.

    [1] https://lore.kernel.org/netdev/CALCETrU0jB+kg0mhV6A8mrHfTE1D1pr1SD_B9Eaa9aDPfgHdtA@mail.gmail.com/
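
For reference, the user-space side of SCM_TS_OPT_ID described in that
commit could look roughly like the sketch below. The helper name
build_ts_opt_id_msg() is hypothetical, and the fallback value 81 for
SCM_TS_OPT_ID is an assumption for when the libc headers do not define it
yet (the value can differ per architecture). The cmsg carries a u32 at
level SOL_SOCKET.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#ifndef SCM_TS_OPT_ID
#define SCM_TS_OPT_ID 81	/* assumption: asm-generic value */
#endif

/*
 * Fill @msg with one payload iovec and one SCM_TS_OPT_ID control message
 * carrying @id. @cbuf must be cmsghdr-aligned and at least
 * CMSG_SPACE(sizeof(uint32_t)) bytes long.
 */
static void build_ts_opt_id_msg(struct msghdr *msg, struct iovec *iov,
				void *payload, size_t len,
				void *cbuf, size_t cbuf_len, uint32_t id)
{
	struct cmsghdr *cmsg;

	assert(cbuf_len >= CMSG_SPACE(sizeof(id)));
	memset(msg, 0, sizeof(*msg));
	memset(cbuf, 0, cbuf_len);

	iov->iov_base = payload;
	iov->iov_len = len;
	msg->msg_iov = iov;
	msg->msg_iovlen = 1;
	msg->msg_control = cbuf;
	msg->msg_controllen = CMSG_SPACE(sizeof(id));

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_TS_OPT_ID;
	cmsg->cmsg_len = CMSG_LEN(sizeof(id));
	memcpy(CMSG_DATA(cmsg), &id, sizeof(id));
}
```

The resulting msghdr would then be passed to sendmsg() on a UDP socket
that has TX timestamping with SOF_TIMESTAMPING_OPT_ID enabled through
SO_TIMESTAMPING. That setup is left out here to keep the sketch short.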

Thanks,
Jason

>
>  /* List of TCP states. There is a build check in net/ipv4/tcp.c to detect
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 3df802410ebf..a2ac57543b6d 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -501,6 +501,7 @@ static void tcp_tx_timestamp(struct sock *sk, struct sockcm_cookie *sockc)
>                 tcb->txstamp_ack_bpf = 1;
>                 shinfo->tx_flags |= SKBTX_BPF;
>                 shinfo->tskey = TCP_SKB_CB(skb)->seq + skb->len - 1;
> +               bpf_skops_tx_timestamping(sk, skb, BPF_SOCK_OPS_TS_SND_CB);
>         }
>  }
>
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 06e68d772989..384502996cdd 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -7045,6 +7045,13 @@ enum {
>                                          * when SK_BPF_CB_TX_TIMESTAMPING
>                                          * feature is on.
>                                          */
> +       BPF_SOCK_OPS_TS_SND_CB,         /* Called when every sendmsg syscall
> +                                        * is triggered. For TCP, it stays
> +                                        * in the last send process to
> +                                        * correlate with tcp_sendmsg timestamp
> +                                        * with other timestamping callbacks,
> +                                        * like SND/SW/ACK.
> +                                        */
>  };
>
>  /* List of TCP states. There is a build check in net/ipv4/tcp.c to detect
> --
> 2.43.5
>
