[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAL+tcoDkFvWmt+GE43sK2pw345hhnVhSv=2+89Wc=s52DHeA1g@mail.gmail.com>
Date: Sat, 11 Jan 2025 06:46:26 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Martin KaFai Lau <martin.lau@...ux.dev>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, dsahern@...nel.org, willemdebruijn.kernel@...il.com,
willemb@...gle.com, ast@...nel.org, daniel@...earbox.net, andrii@...nel.org,
eddyz87@...il.com, song@...nel.org, yonghong.song@...ux.dev,
john.fastabend@...il.com, kpsingh@...nel.org, sdf@...ichev.me,
haoluo@...gle.com, jolsa@...nel.org, bpf@...r.kernel.org,
netdev@...r.kernel.org, Jason Xing <kernelxing@...cent.com>
Subject: Re: [PATCH net-next v4 10/11] net-timestamp: export the tskey for TCP
bpf extension
On Sat, Jan 11, 2025 at 4:36 AM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
>
> On 1/7/25 8:21 PM, Jason Xing wrote:
> > Hi Martin,
> >
> >>> - bpf_skops_tx_timestamping(sk, skb, op, 2, args);
> >>> + if (sk_is_tcp(sk))
> >>> + args[2] = skb_shinfo(skb)->tskey;
> >>
> >> Instead of only passing one info "skb_shinfo(skb)->tskey" of a skb, pass the
> >> whole skb ptr to the bpf prog. Take a look at bpf_skops_init_skb. Lets start
> >> with end_offset = 0 for now so that the bpf prog won't use it to read the
> >> skb->data. It can be revisited later.
> >>
> >> bpf_skops_init_skb(&sock_ops, skb, 0);
> >>
> >> The bpf prog can use bpf_cast_to_kern_ctx() and bpf_core_cast() to get to the
> >> skb_shinfo(skb). Take a look at the md_skb example in type_cast.c.
> >
> > In recent days, I've been working on this part. It turns out to be
> > infeasible to pass "struct __sk_buff *skb" as the second parameter in
> > skops_sockopt() in patch [11/11]. I cannot find a way to acquire the
> > skb itself
>
> I didn't mean to pass skb in sock_ops_kern->args[] or pass skb to the bpf prog
> "SEC("sockops") skops_sockopt(struct bpf_sock_ops *skops, struct sk_buff *skb)".
> The bpf prog can only take one ctx argument which is
> "struct bpf_sock_ops *skops" here.
>
> I meant to have kernel initializing the sock_ops_kern->skb by doing
> "bpf_skops_init_skb(&sock_ops, skb, 0);" before running the bpf prog.
>
> The bpf prog can read the skb by using bpf_cast_to_kern_ctx() and bpf_core_cast().
>
> Something like the following. I directly change the existing test_tcp_hdr_options.c.
> It has not been changed to use the vmlinux.h, so I need to redefine some parts of
> the sk_buff, skb_shared_info, and bpf_sock_ops_kern. Your new test should directly
> include <vmlinux.h> and no need to redefine them.
>
> Untested code:
>
> diff --git i/tools/testing/selftests/bpf/progs/test_tcp_hdr_options.c w/tools/testing/selftests/bpf/progs/test_tcp_hdr_options.c
> index 5f4e87ee949a..c98ebe71f6ba 100644
> --- i/tools/testing/selftests/bpf/progs/test_tcp_hdr_options.c
> +++ w/tools/testing/selftests/bpf/progs/test_tcp_hdr_options.c
> @@ -12,8 +12,10 @@
> #include <linux/types.h>
> #include <bpf/bpf_helpers.h>
> #include <bpf/bpf_endian.h>
> +#include <bpf/bpf_core_read.h>
> #define BPF_PROG_TEST_TCP_HDR_OPTIONS
> #include "test_tcp_hdr_options.h"
> +#include "bpf_kfuncs.h"
>
> #ifndef sizeof_field
> #define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))
> @@ -348,9 +350,63 @@ static int current_mss_opt_len(struct bpf_sock_ops *skops)
> return CG_OK;
> }
>
> +struct sk_buff {
> + unsigned int end;
> + unsigned char *head;
> +} __attribute__((preserve_access_index));
> +
> +struct skb_shared_info {
> + __u8 flags;
> + __u8 meta_len;
> + __u8 nr_frags;
> + __u8 tx_flags;
> + unsigned short gso_size;
> + unsigned short gso_segs;
> + unsigned int gso_type;
> + __u32 tskey;
> +} __attribute__((preserve_access_index));
> +
> +struct bpf_sock_ops_kern {
> + struct sock *sk;
> + union {
> + __u32 args[4];
> + __u32 reply;
> + __u32 replylong[4];
> + };
> + struct sk_buff *syn_skb;
> + struct sk_buff *skb;
> + void *skb_data_end;
> + __u8 op;
> + __u8 is_fullsock;
> + __u8 remaining_opt_len;
> + __u64 temp; /* temp and everything after is not
> + * initialized to 0 before calling
> + * the BPF program. New fields that
> + * should be initialized to 0 should
> + * be inserted before temp.
> + * temp is scratch storage used by
> + * sock_ops_convert_ctx_access
> + * as temporary storage of a register.
> + */
> +} __attribute__((preserve_access_index));
> +
> static int handle_hdr_opt_len(struct bpf_sock_ops *skops)
> {
> __u8 tcp_flags = skops_tcp_flags(skops);
> + struct bpf_sock_ops_kern *skops_kern;
> + struct skb_shared_info *shared_info;
> + struct sk_buff *skb;
> +
> + skops_kern = bpf_cast_to_kern_ctx(skops);
Oh, I misunderstood the use of bpf_cast_to_kern_ctx() function and
failed/struggled to fetch the "struct bpf_sock_ops_kern".
Now I realized. Thanks so much for your detailed codes! I will try
this in a few hours.
> + skb = skops_kern->skb;
> +
> + if (skb) {
> + shared_info = bpf_core_cast(skb->head + skb->end, struct skb_shared_info);
> + /* printk as an example. don't do that in selftests. */
> + bpf_printk("tskey %u gso_size %u gso_segs %u gso_type %u flags %x\n",
> + shared_info->tskey, shared_info->gso_size,
> + shared_info->gso_segs, shared_info->gso_type, shared_info->flags);
> + }
>
> if ((tcp_flags & TCPHDR_SYNACK) == TCPHDR_SYNACK)
> return synack_opt_len(skops);
>
>
Powered by blists - more mailing lists