netdev - Re: [PATCH bpf-next v12 01/12] bpf: add networking timestamping support to bpf

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <67b5f4f5990b0_1b78d829412@willemb.c.googlers.com.notmuch>
Date: Wed, 19 Feb 2025 10:12:53 -0500
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Jason Xing <kerneljasonxing@...il.com>, 
 Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: Martin KaFai Lau <martin.lau@...ux.dev>, 
 davem@...emloft.net, 
 edumazet@...gle.com, 
 kuba@...nel.org, 
 pabeni@...hat.com, 
 dsahern@...nel.org, 
 willemb@...gle.com, 
 ast@...nel.org, 
 daniel@...earbox.net, 
 andrii@...nel.org, 
 eddyz87@...il.com, 
 song@...nel.org, 
 yonghong.song@...ux.dev, 
 john.fastabend@...il.com, 
 kpsingh@...nel.org, 
 sdf@...ichev.me, 
 haoluo@...gle.com, 
 jolsa@...nel.org, 
 shuah@...nel.org, 
 ykolal@...com, 
 bpf@...r.kernel.org, 
 netdev@...r.kernel.org
Subject: Re: [PATCH bpf-next v12 01/12] bpf: add networking timestamping
 support to bpf_get/setsockopt()

> > > Now I wonder if I should use the u8 sk_bpf_cb_flags in V13 or just
> > > keep it as-is? Either way is fine with me :) bpf_sock_ops_cb_flags
> > > uses u8 as an example, thus I think we prefer the former?
> >
> > If it fits in a u8 and that in practice also results in less memory
> > and cache pressure (i.e., does not just add a 24b hole), then it is a
> > net improvement.
> 
> Probably I didn't state it clearly. I agree with you on saving memory:)
> 
> In the previous response, I was trying to keep the sk_bpf_cb_flags
> flag and use a u8 instead. I admit u32 is too large after you noticed
> this.
> 
> Would the following diff on top of this series be acceptable for you?
> And would it be a proper place to put the u8 sk_bpf_cb_flags in struct
> sock?
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 6f4d54faba92..e85d6fb3a2ba 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -447,7 +447,7 @@ struct sock {
>         int                     sk_forward_alloc;
>         u32                     sk_tsflags;
>  #define SK_BPF_CB_FLAG_TEST(SK, FLAG) ((SK)->sk_bpf_cb_flags & (FLAG))
> -       u32                     sk_bpf_cb_flags;
> +       u8                      sk_bpf_cb_flags;
>         __cacheline_group_end(sock_write_rxtx);
> 
>         __cacheline_group_begin(sock_write_tx);
> 
> The following output is the result of running 'pahole --hex -C sock vmlinux'.
> Before this series:
>         u32                        sk_tsflags;           /* 0x168   0x4 */
>         __u8
> __cacheline_group_end__sock_write_rxtx[0]; /* 0x16c     0 */
>         __u8
> __cacheline_group_begin__sock_write_tx[0]; /* 0x16c     0 */
>         int                        sk_write_pending;     /* 0x16c   0x4 */
>         atomic_t                   sk_omem_alloc;        /* 0x170   0x4 */
>         int                        sk_sndbuf;            /* 0x174   0x4 */
>         int                        sk_wmem_queued;       /* 0x178   0x4 */
>         refcount_t                 sk_wmem_alloc;        /* 0x17c   0x4 */
>         /* --- cacheline 6 boundary (384 bytes) --- */
>         long unsigned int          sk_tsq_flags;         /* 0x180   0x8 */
> ...
> /* sum members: 773, holes: 1, sum holes: 1 */
> 
> After this diff patch:
>         u32                        sk_tsflags;           /* 0x168   0x4 */
>         u8                         sk_bpf_cb_flags;      /* 0x16c   0x1 */
>         __u8
> __cacheline_group_end__sock_write_rxtx[0]; /* 0x16d     0 */
>         __u8
> __cacheline_group_begin__sock_write_tx[0]; /* 0x16d     0 */
> 
>         /* XXX 3 bytes hole, try to pack */
> 
>         int                        sk_write_pending;     /* 0x170   0x4 */
>         atomic_t                   sk_omem_alloc;        /* 0x174   0x4 */
>         int                        sk_sndbuf;            /* 0x178   0x4 */
>         int                        sk_wmem_queued;       /* 0x17c   0x4 */
>         /* --- cacheline 6 boundary (384 bytes) --- */
>         refcount_t                 sk_wmem_alloc;        /* 0x180   0x4 */
> 
>         /* XXX 4 bytes hole, try to pack */
> 
>         long unsigned int          sk_tsq_flags;         /* 0x188   0x8 */
> ...
> /* sum members: 774, holes: 3, sum holes: 8 */
> 
> It will introduce 7 extra sum holes if this series with this u8 change
> gets applied. I think it's a proper position because this new
> sk_bpf_cb_flags will be used in the tx and rx path just like
> sk_tsflags, aligned with rules introduced by the commit[1].

Reducing a u64 to u8 can leave 7b of holes, but that is not great,
of course.

Since this bitmap is only touched if a BPF program is loaded, arguably
it need not be in the hot path cacheline groups.

Can you find a hole further down to place this in, or at least a spot
that does not result in 7b of wasted space (in the hotpath cacheline
groups of all places).