[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <67eec501d0d58_14b7b229490@willemb.c.googlers.com.notmuch>
Date: Thu, 03 Apr 2025 13:27:29 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Stanislav Fomichev <stfomichev@...il.com>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: bpf@...r.kernel.org,
netdev@...r.kernel.org,
ast@...nel.org,
daniel@...earbox.net,
john.fastabend@...il.com,
Willem de Bruijn <willemb@...gle.com>,
Matt Moeller <moeller.matt@...il.com>,
Maciej Żenczykowski <maze@...gle.com>
Subject: Re: [PATCH bpf 1/2] bpf: support SKF_NET_OFF and SKF_LL_OFF on skb
frags
Stanislav Fomichev wrote:
> On 04/03, Willem de Bruijn wrote:
> > From: Willem de Bruijn <willemb@...gle.com>
> >
> > Classic BPF socket filters with SKB_NET_OFF and SKB_LL_OFF fail to
> > read when these offsets extend into frags.
> >
> > This has been observed with iwlwifi and reproduced with tun with
> > IFF_NAPI_FRAGS. The below straightforward socket filter on UDP port,
> > applied to a RAW socket, will silently miss matching packets.
> >
> > const int offset_proto = offsetof(struct ip6_hdr, ip6_nxt);
> > const int offset_dport = sizeof(struct ip6_hdr) + offsetof(struct udphdr, dest);
> > struct sock_filter filter_code[] = {
> > BPF_STMT(BPF_LD + BPF_B + BPF_ABS, SKF_AD_OFF + SKF_AD_PKTTYPE),
> > BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, PACKET_HOST, 0, 4),
> > BPF_STMT(BPF_LD + BPF_B + BPF_ABS, SKF_NET_OFF + offset_proto),
> > BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, IPPROTO_UDP, 0, 2),
> > BPF_STMT(BPF_LD + BPF_H + BPF_ABS, SKF_NET_OFF + offset_dport),
> >
> > This is unexpected behavior. Socket filter programs should be
> > consistent regardless of environment. Silent misses are
> > particularly concerning as hard to detect.
> >
> > Use skb_copy_bits for offsets outside linear, same as done for
> > non-SKF_(LL|NET) offsets.
> >
> > Offset is always positive after subtracting the reference threshold
> > SKB_(LL|NET)_OFF, so is always >= skb_(mac|network)_offset. The sum of
> > the two is an offset against skb->data, and may be negative, but it
> > cannot point before skb->head, as skb_(mac|network)_offset would too.
> >
> > This appears to go back to when frag support was introduced to
> > sk_run_filter in linux-2.4.4, before the introduction of git.
> >
> > The amount of code change and 8/16/32 bit duplication are unfortunate.
> > But any attempt I made to be smarter saved very few LoC while
> > complicating the code.
> >
> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > Link: https://lore.kernel.org/netdev/20250122200402.3461154-1-maze@google.com/
> > Link: https://elixir.bootlin.com/linux/2.4.4/source/net/core/filter.c#L244
> > Reported-by: Matt Moeller <moeller.matt@...il.com>
> > Co-developed-by: Maciej Żenczykowski <maze@...gle.com>
> > Signed-off-by: Maciej Żenczykowski <maze@...gle.com>
> > Signed-off-by: Willem de Bruijn <willemb@...gle.com>
> > ---
> > include/linux/filter.h | 3 --
> > kernel/bpf/core.c | 21 ------------
> > net/core/filter.c | 75 +++++++++++++++++++++++-------------------
> > 3 files changed, 42 insertions(+), 57 deletions(-)
> >
> > diff --git a/include/linux/filter.h b/include/linux/filter.h
> > index f5cf4d35d83e..708ac7e0cd36 100644
> > --- a/include/linux/filter.h
> > +++ b/include/linux/filter.h
> > @@ -1496,9 +1496,6 @@ static inline u16 bpf_anc_helper(const struct sock_filter *ftest)
> > }
> > }
> >
> > -void *bpf_internal_load_pointer_neg_helper(const struct sk_buff *skb,
> > - int k, unsigned int size);
> > -
> > static inline int bpf_tell_extensions(void)
> > {
> > return SKF_AD_MAX;
> > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > index ba6b6118cf50..0e836b5ac9a0 100644
> > --- a/kernel/bpf/core.c
> > +++ b/kernel/bpf/core.c
> > @@ -68,27 +68,6 @@
> > struct bpf_mem_alloc bpf_global_ma;
> > bool bpf_global_ma_set;
> >
> > -/* No hurry in this branch
> > - *
> > - * Exported for the bpf jit load helper.
> > - */
> > -void *bpf_internal_load_pointer_neg_helper(const struct sk_buff *skb, int k, unsigned int size)
> > -{
> > - u8 *ptr = NULL;
> > -
> > - if (k >= SKF_NET_OFF) {
> > - ptr = skb_network_header(skb) + k - SKF_NET_OFF;
> > - } else if (k >= SKF_LL_OFF) {
> > - if (unlikely(!skb_mac_header_was_set(skb)))
> > - return NULL;
> > - ptr = skb_mac_header(skb) + k - SKF_LL_OFF;
> > - }
> > - if (ptr >= skb->head && ptr + size <= skb_tail_pointer(skb))
> > - return ptr;
> > -
> > - return NULL;
> > -}
> > -
> > /* tell bpf programs that include vmlinux.h kernel's PAGE_SIZE */
> > enum page_size_enum {
> > __PAGE_SIZE = PAGE_SIZE
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index bc6828761a47..b232b70dd10d 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -221,21 +221,24 @@ BPF_CALL_3(bpf_skb_get_nlattr_nest, struct sk_buff *, skb, u32, a, u32, x)
> > BPF_CALL_4(bpf_skb_load_helper_8, const struct sk_buff *, skb, const void *,
> > data, int, headlen, int, offset)
> > {
> > - u8 tmp, *ptr;
> > + u8 tmp;
> > const int len = sizeof(tmp);
> >
> > - if (offset >= 0) {
> > - if (headlen - offset >= len)
> > - return *(u8 *)(data + offset);
> > - if (!skb_copy_bits(skb, offset, &tmp, sizeof(tmp)))
> > - return tmp;
> > - } else {
> > - ptr = bpf_internal_load_pointer_neg_helper(skb, offset, len);
> > - if (likely(ptr))
> > - return *(u8 *)ptr;
>
> [..]
>
> > + if (offset < 0) {
> > + if (offset >= SKF_NET_OFF)
> > + offset += skb_network_offset(skb) - SKF_NET_OFF;
> > + else if (offset >= SKF_LL_OFF && skb_mac_header_was_set(skb))
> > + offset += skb_mac_offset(skb) - SKF_LL_OFF;
> > + else
> > + return -EFAULT;
> > }
>
> nit: we now repeat the same logic three times, maybe still worth it to put it
> into a helper? bpf_resolve_classic_offset or something.
I definitely tried this in various ways. But since the core logic is
only four lines and there is an early return on error, no helper
really simplifies anything. It just adds a layer of indirection and
more code in the end.
Powered by blists - more mailing lists