lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <67a3d732aeff0_170d392942b@willemb.c.googlers.com.notmuch>
Date: Wed, 05 Feb 2025 16:25:06 -0500
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Jason Xing <kerneljasonxing@...il.com>, 
 Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: davem@...emloft.net, 
 edumazet@...gle.com, 
 kuba@...nel.org, 
 pabeni@...hat.com, 
 dsahern@...nel.org, 
 willemb@...gle.com, 
 ast@...nel.org, 
 daniel@...earbox.net, 
 andrii@...nel.org, 
 martin.lau@...ux.dev, 
 eddyz87@...il.com, 
 song@...nel.org, 
 yonghong.song@...ux.dev, 
 john.fastabend@...il.com, 
 kpsingh@...nel.org, 
 sdf@...ichev.me, 
 haoluo@...gle.com, 
 jolsa@...nel.org, 
 horms@...nel.org, 
 bpf@...r.kernel.org, 
 netdev@...r.kernel.org
Subject: Re: [PATCH bpf-next v8 01/12] bpf: add support for bpf_setsockopt()

Jason Xing wrote:
> On Wed, Feb 5, 2025 at 11:22 PM Willem de Bruijn
> <willemdebruijn.kernel@...il.com> wrote:
> >
> > Jason Xing wrote:
> > > Users can write the following code to enable the bpf extension:
> > > int flags = SK_BPF_CB_TX_TIMESTAMPING;
> > > int opts = SK_BPF_CB_FLAGS;
> > > bpf_setsockopt(skops, SOL_SOCKET, opts, &flags, sizeof(flags));
> > >
> > > Signed-off-by: Jason Xing <kerneljasonxing@...il.com>
> > > ---
> > >  include/net/sock.h             |  3 +++
> > >  include/uapi/linux/bpf.h       |  8 ++++++++
> > >  net/core/filter.c              | 23 +++++++++++++++++++++++
> > >  tools/include/uapi/linux/bpf.h |  1 +
> > >  4 files changed, 35 insertions(+)
> > >
> > > diff --git a/include/net/sock.h b/include/net/sock.h
> > > index 8036b3b79cd8..7916982343c6 100644
> > > --- a/include/net/sock.h
> > > +++ b/include/net/sock.h
> > > @@ -303,6 +303,7 @@ struct sk_filter;
> > >    *  @sk_stamp: time stamp of last packet received
> > >    *  @sk_stamp_seq: lock for accessing sk_stamp on 32 bit architectures only
> > >    *  @sk_tsflags: SO_TIMESTAMPING flags
> > > +  *  @sk_bpf_cb_flags: used in bpf_setsockopt()
> > >    *  @sk_use_task_frag: allow sk_page_frag() to use current->task_frag.
> > >    *                     Sockets that can be used under memory reclaim should
> > >    *                     set this to false.
> > > @@ -445,6 +446,8 @@ struct sock {
> > >       u32                     sk_reserved_mem;
> > >       int                     sk_forward_alloc;
> > >       u32                     sk_tsflags;
> > > +#define SK_BPF_CB_FLAG_TEST(SK, FLAG) ((SK)->sk_bpf_cb_flags & (FLAG))
> > > +     u32                     sk_bpf_cb_flags;
> > >       __cacheline_group_end(sock_write_rxtx);
> > >
> > >       __cacheline_group_begin(sock_write_tx);
> > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > index 2acf9b336371..6116eb3d1515 100644
> > > --- a/include/uapi/linux/bpf.h
> > > +++ b/include/uapi/linux/bpf.h
> > > @@ -6913,6 +6913,13 @@ enum {
> > >       BPF_SOCK_OPS_ALL_CB_FLAGS       = 0x7F,
> > >  };
> > >
> > > +/* Definitions for bpf_sk_cb_flags */
> > > +enum {
> > > +     SK_BPF_CB_TX_TIMESTAMPING       = 1<<0,
> > > +     SK_BPF_CB_MASK                  = (SK_BPF_CB_TX_TIMESTAMPING - 1) |
> > > +                                        SK_BPF_CB_TX_TIMESTAMPING
> > > +};
> > > +
> > >  /* List of known BPF sock_ops operators.
> > >   * New entries can only be added at the end
> > >   */
> > > @@ -7091,6 +7098,7 @@ enum {
> > >       TCP_BPF_SYN_IP          = 1006, /* Copy the IP[46] and TCP header */
> > >       TCP_BPF_SYN_MAC         = 1007, /* Copy the MAC, IP[46], and TCP header */
> > >       TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Get or Set TCP sock ops flags */
> > > +     SK_BPF_CB_FLAGS         = 1009, /* Used to set socket bpf flags */
> > >  };
> > >
> > >  enum {
> > > diff --git a/net/core/filter.c b/net/core/filter.c
> > > index 2ec162dd83c4..1c6c07507a78 100644
> > > --- a/net/core/filter.c
> > > +++ b/net/core/filter.c
> > > @@ -5222,6 +5222,25 @@ static const struct bpf_func_proto bpf_get_socket_uid_proto = {
> > >       .arg1_type      = ARG_PTR_TO_CTX,
> > >  };
> > >
> > > +static int sk_bpf_set_get_cb_flags(struct sock *sk, char *optval, bool getopt)
> > > +{
> > > +     u32 sk_bpf_cb_flags;
> > > +
> > > +     if (getopt) {
> > > +             *(u32 *)optval = sk->sk_bpf_cb_flags;
> > > +             return 0;
> > > +     }
> > > +
> > > +     sk_bpf_cb_flags = *(u32 *)optval;
> > > +
> > > +     if (sk_bpf_cb_flags & ~SK_BPF_CB_MASK)
> > > +             return -EINVAL;
> > > +
> > > +     sk->sk_bpf_cb_flags = sk_bpf_cb_flags;
> >
> > I don't know BPF internals that well:
> >
> > Is there mutual exclusion between these sol_socket_sockopt calls?
> > Or do these sk field accesses need WRITE_ONCE/READ_ONCE.
> 
> According to the existing callbacks (like
> BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB which I used in the selftests) in
> include/uapi/linux/bpf.h, they are under the socket lock protection.
> And the correct use of this feature is to set during the 3-way
> handshake that also is protected by lock. But after you remind me of
> this potential data race issue, just in case bpf program doesn't use
> it as we expect, I think I will add the this annotation in v9.

Let's not add instrumentation defensively where not needed. Doing so
confuses future readers who will assume that it was needed and cannot
see why.

Either leave it for now. Or if it is needed for lockless UDP, add it
now, but then add an explicit comment that it is for that use case.



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ