[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAL+tcoBdUJSTGKG2UhUtNi6z6CvkYR-QdgWd62K4xGi1RXyqqw@mail.gmail.com>
Date: Wed, 15 Jan 2025 08:43:35 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Martin KaFai Lau <martin.lau@...ux.dev>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, dsahern@...nel.org, willemdebruijn.kernel@...il.com,
willemb@...gle.com, ast@...nel.org, daniel@...earbox.net, andrii@...nel.org,
eddyz87@...il.com, song@...nel.org, yonghong.song@...ux.dev,
john.fastabend@...il.com, kpsingh@...nel.org, sdf@...ichev.me,
haoluo@...gle.com, jolsa@...nel.org, horms@...nel.org, bpf@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: [PATCH net-next v5 02/15] net-timestamp: prepare for bpf prog use
On Wed, Jan 15, 2025 at 8:37 AM Jason Xing <kerneljasonxing@...il.com> wrote:
>
> On Wed, Jan 15, 2025 at 8:26 AM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
> >
> > On 1/14/25 4:15 PM, Jason Xing wrote:
> > > On Wed, Jan 15, 2025 at 8:09 AM Jason Xing <kerneljasonxing@...il.com> wrote:
> > >>
> > >> On Wed, Jan 15, 2025 at 7:40 AM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
> > >>>
> > >>> On 1/12/25 3:37 AM, Jason Xing wrote:
> > >>>> Later, I would introduce three points to report some information
> > >>>> to user space based on this.
> > >>>>
> > >>>> Signed-off-by: Jason Xing <kerneljasonxing@...il.com>
> > >>>> ---
> > >>>> include/net/sock.h | 7 +++++++
> > >>>> net/core/sock.c | 14 ++++++++++++++
> > >>>> 2 files changed, 21 insertions(+)
> > >>>>
> > >>>> diff --git a/include/net/sock.h b/include/net/sock.h
> > >>>> index f5447b4b78fd..dd874e8337c0 100644
> > >>>> --- a/include/net/sock.h
> > >>>> +++ b/include/net/sock.h
> > >>>> @@ -2930,6 +2930,13 @@ int sock_set_timestamping(struct sock *sk, int optname,
> > >>>> struct so_timestamping timestamping);
> > >>>>
> > >>>> void sock_enable_timestamps(struct sock *sk);
> > >>>> +#if defined(CONFIG_CGROUP_BPF) && defined(CONFIG_BPF_SYSCALL)
> > >>>> +void bpf_skops_tx_timestamping(struct sock *sk, struct sk_buff *skb, int op);
> > >>>> +#else
> > >>>> +static inline void bpf_skops_tx_timestamping(struct sock *sk, struct sk_buff *skb, int op)
> > >>>> +{
> > >>>> +}
> > >>>> +#endif
> > >>>> void sock_no_linger(struct sock *sk);
> > >>>> void sock_set_keepalive(struct sock *sk);
> > >>>> void sock_set_priority(struct sock *sk, u32 priority);
> > >>>> diff --git a/net/core/sock.c b/net/core/sock.c
> > >>>> index eae2ae70a2e0..e06bcafb1b2d 100644
> > >>>> --- a/net/core/sock.c
> > >>>> +++ b/net/core/sock.c
> > >>>> @@ -948,6 +948,20 @@ int sock_set_timestamping(struct sock *sk, int optname,
> > >>>> return 0;
> > >>>> }
> > >>>>
> > >>>> +#if defined(CONFIG_CGROUP_BPF) && defined(CONFIG_BPF_SYSCALL)
> > >>>> +void bpf_skops_tx_timestamping(struct sock *sk, struct sk_buff *skb, int op)
> > >>>> +{
> > >>>> + struct bpf_sock_ops_kern sock_ops;
> > >>>> +
> > >>>> + memset(&sock_ops, 0, offsetof(struct bpf_sock_ops_kern, temp));
> > >>>> + sock_ops.op = op;
> > >>>> + if (sk_is_tcp(sk) && sk_fullsock(sk))
> > >>>> + sock_ops.is_fullsock = 1;
> > >>>> + sock_ops.sk = sk;
> > >>>> + __cgroup_bpf_run_filter_sock_ops(sk, &sock_ops, CGROUP_SOCK_OPS);
> > >>>
> > >>> hmm... I think I have already mentioned it in the earlier revision
> > >>> (https://lore.kernel.org/bpf/f8e9ab4a-38b9-43a5-aaf4-15f95a3463d0@linux.dev/).
> > >>
> > >> Right, sorry, but I deleted it intentionally.
> > >>
> > >>>
> > >>> __cgroup_bpf_run_filter_sock_ops(sk, ...) requires sk to be fullsock.
> > >>
> > >> Well, I don't understand it, BPF_CGROUP_RUN_PROG_SOCK_OPS_SK() don't
> > >> need to check whether it is fullsock or not.
> >
> > It is because the callers of BPF_CGROUP_RUN_PROG_SOCK_OPS_SK guarantees it is
> > fullsock.
> >
> > >>
> > >>> Take a look at how BPF_CGROUP_RUN_PROG_SOCK_OPS does it.
> > >>> sk_to_full_sk() is used to get back the listener. For other mini socks,
> > >>> it needs to skip calling the cgroup bpf prog. I still don't understand
> > >>> why BPF_CGROUP_RUN_PROG_SOCK_OPS cannot be used here because of udp.
> > >>
> > >> Sorry, I got lost here. BPF_CGROUP_RUN_PROG_SOCK_OPS cannot support
> > >> udp, right? And I think we've discussed that we have to get rid of the
> > >> limitation of fullsock.
> >
> > It is the part I am missing. Why BPF_CGROUP_RUN_PROG_SOCK_OPS cannot support
> > udp? UDP is not a fullsock?
>
> No, you're not missing anything. UDP is a fullsock and
> BPF_CGROUP_RUN_PROG_SOCK_OPS itself can support udp as my v3 version
> used this method already like you suggest. I feel like
> misunderstanding what you really suggest. Sorry for the trouble
> caused.
>
> I wonder if using is_fullsock would affect/break the usage of fetching
> some fields, especially tcp related fields, in
> sock_ops_convert_ctx_access()? Sorry that I'm not a bpf expert :(
>
> If not, I will use BPF_CGROUP_RUN_PROG_SOCK_OPS instead.
To be clearer, I would use the following code snippet in the next respin:
+void bpf_skops_tx_timestamping(struct sock *sk, struct sk_buff *skb, int op)
+{
+ struct bpf_sock_ops_kern sock_ops;
+
+ memset(&sock_ops, 0, offsetof(struct bpf_sock_ops_kern, temp));
+ sock_ops.op = op;
+ sock_ops.is_fullsock = 1;
+ sock_ops.sk = sk;
+ BPF_CGROUP_RUN_PROG_SOCK_OPS(sk, &sock_ops, CGROUP_SOCK_OPS);
+}
Thanks,
Jason
Powered by blists - more mailing lists