Message-Id: <20250630-skb-metadata-thru-dynptr-v1-0-f17da13625d8@cloudflare.com>
Date: Mon, 30 Jun 2025 16:55:33 +0200
From: Jakub Sitnicki <jakub@...udflare.com>
To: bpf@...r.kernel.org
Cc: Alexei Starovoitov <ast@...nel.org>, 
 Arthur Fabre <arthur@...hurfabre.com>, Eric Dumazet <edumazet@...gle.com>, 
 Jakub Kicinski <kuba@...nel.org>, Jesper Dangaard Brouer <hawk@...nel.org>, 
 Jesse Brandeburg <jbrandeburg@...udflare.com>, 
 Joanne Koong <joannelkoong@...il.com>, 
 Lorenzo Bianconi <lorenzo@...nel.org>, 
 Toke Høiland-Jørgensen <thoiland@...hat.com>, 
 Yan Zhai <yan@...udflare.com>, netdev@...r.kernel.org, 
 kernel-team@...udflare.com, Stanislav Fomichev <sdf@...ichev.me>
Subject: [PATCH bpf-next 00/13] Extend skb dynptr for metadata access from
 TC

TL;DR
-----

This is the first step in an effort which aims to enable skb metadata
access for all BPF programs which operate on an skb context.

By skb metadata we mean the custom metadata area which can be allocated
from an XDP program with the bpf_xdp_adjust_meta helper. Network stack code
accesses it using the skb_metadata_* helpers.
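
To make the layout concrete, here is a small user-space model of the
metadata area. It is only a sketch: struct pkt_model and the pkt_*
functions are hypothetical names made up for illustration, not kernel
code, and the model ignores any additional constraints the real
bpf_xdp_adjust_meta helper enforces. It only shows that the metadata sits
directly in front of the payload and grows toward the start of the
headroom.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space model of a packet buffer; not a kernel struct. */
struct pkt_model {
	uint8_t buf[256];
	size_t data_meta;	/* offset of the first metadata byte */
	size_t data;		/* offset of the first payload byte */
	size_t data_end;	/* offset one past the last payload byte */
};

/* Place 'len' payload bytes after 'headroom' bytes, with no metadata yet. */
static void pkt_init(struct pkt_model *p, size_t headroom, size_t len)
{
	assert(headroom + len <= sizeof(p->buf));
	memset(p->buf, 0, sizeof(p->buf));
	p->data_meta = headroom;
	p->data = headroom;
	p->data_end = headroom + len;
}

/*
 * Model of adjusting the metadata area: a negative delta grows it toward
 * the start of the buffer, a positive delta shrinks it. Returns 0 on
 * success, -1 if the new start would fall outside the headroom or past
 * the start of the payload.
 */
static int pkt_adjust_meta(struct pkt_model *p, int delta)
{
	long new_meta = (long)p->data_meta + delta;

	if (new_meta < 0 || (size_t)new_meta > p->data)
		return -1;
	p->data_meta = (size_t)new_meta;
	return 0;
}

/* Metadata length is implied by the distance between the two offsets. */
static size_t pkt_meta_len(const struct pkt_model *p)
{
	return p->data - p->data_meta;
}
```

In this model, pkt_adjust_meta(&p, -8) on a packet with 64 bytes of
headroom yields an 8-byte metadata area at offsets 56..63, immediately
before the payload.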

Overview
--------

Today, the skb metadata is accessible only to BPF TC ingress programs,
through the __sk_buff->data_meta pointer. We propose a three-step plan to
make skb metadata available to all other BPF programs which operate on skb
objects:

 1) Extend skb dynptr for metadata access from TC (this patch set)

    This is a preparatory step, but it also stands on its own. Here we
    enable access to the skb metadata through a bpf_dynptr, the same way we
    can already access the skb payload today.

    In the next step (2) we plan to relocate the metadata as the skb
    travels through the network stack. That will require a safe way to
    access the metadata area irrespective of its location.

    The checks relying on pointer arithmetic - __sk_buff->data_meta and
    ->data - were not built for that. They require the metadata to be
    located right in front of the payload. Otherwise their guarantees break
    down.

    This is where the dynptr [1] comes into play. It solves exactly that
    problem. The dynptr to skb metadata can be backed by a memory area that
    resides in a different location depending on code path.

 2) Persist skb metadata past the TC hook (future)

    Keeping the metadata in front of the packet headers as the skb travels
    through the network stack is problematic - see the discussion of
    alternative approaches below. Hence, we plan to relocate the metadata
    as necessary after the TC hook.

    Where to? We don't know yet. There are a couple of options: (i) move it
    to the top of skb headroom, or (ii) allocate dedicated memory for it.
    They are not mutually exclusive. The right solution might be a mix.

    When? That is also an open question. It could be done at the
    device-to-protocol handover, or lazily when headers get pushed or the
    headroom gets resized.

 3) skb dynptr for sockops, sk_lookup, etc. (future)

    There are BPF program types which don't get an __sk_buff as a context,
    but they either have, or could have in some cases, access to the skb
    itself. As a final touch, we want to provide a way to create an skb
    dynptr from these special contexts.
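
The reason a handle-based accessor survives relocation, while pointer
arithmetic against ->data does not, can be sketched in user space. This is
a toy model under stated assumptions: struct meta_handle and the meta_*
functions are hypothetical names, the argument order only loosely mirrors
bpf_dynptr_read and bpf_dynptr_write, and the -E2BIG return value is
chosen for illustration. It is not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a dynptr: a handle over a byte range. */
struct meta_handle {
	uint8_t *base;	/* where the metadata currently lives */
	size_t size;	/* metadata length in bytes */
};

/* Bounds-checked copy out of the metadata area. */
static int meta_read(void *dst, size_t len,
		     const struct meta_handle *h, size_t off)
{
	if (off > h->size || len > h->size - off)
		return -E2BIG;
	memcpy(dst, h->base + off, len);
	return 0;
}

/* Bounds-checked copy into the metadata area. */
static int meta_write(struct meta_handle *h, size_t off,
		      const void *src, size_t len)
{
	if (off > h->size || len > h->size - off)
		return -E2BIG;
	memcpy(h->base + off, src, len);
	return 0;
}

/*
 * Model of step (2): move the metadata somewhere else. Callers never see
 * the raw address, so they keep using the same handle afterwards.
 */
static void meta_relocate(struct meta_handle *h, uint8_t *new_base)
{
	assert(new_base != NULL);
	memmove(new_base, h->base, h->size);
	h->base = new_base;
}
```

A caller that writes a value through the handle, has the metadata
relocated underneath it, and then reads through the same handle gets the
same value back. No check in the caller depends on the metadata being
adjacent to the payload.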

TIMTOWTDI
---------

Alternative approaches which we considered:

* Keep the metadata always in front of skb->data

We think it is a bad idea for two reasons, outlined below. Nevertheless we
are open to it if necessary.

 1) Performance concerns

    It would require the network stack to move the metadata on each header
    pull/push (see skb_reorder_vlan_header() for an example). While doable,
    there is an expected performance overhead.

 2) Potential for bugs

    In addition to updating skb_push/pull and pskb_expand_head, we would
    need to audit any code paths which operate on the skb->data pointer
    directly without going through the helpers. This creates a "known
    unknown" risk.

* Design a new custom metadata area from scratch

We have tried that in Arthur's patch set [2]. One of the outcomes of the
discussion there was that we don't want to have two places to store custom
metadata. Hence the change of approach.
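
The performance concern with the first alternative can be illustrated
with a user-space sketch of a header push that keeps the metadata adjacent
to the payload. struct skb_model and push_keep_meta are hypothetical names
used only for this illustration. The point is that every push pays for a
memmove of the whole metadata area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: metadata occupies buf[data - meta_len, data). */
struct skb_model {
	uint8_t buf[128];
	size_t data;		/* offset of the first payload byte */
	size_t meta_len;	/* length of metadata right before data */
};

/*
 * Push 'n' header bytes while keeping the metadata directly in front of
 * data. Returns the number of metadata bytes that had to be moved, or -1
 * if the headroom cannot hold both the new header and the metadata.
 */
static long push_keep_meta(struct skb_model *s, size_t n)
{
	uint8_t *meta;

	assert(s->meta_len <= s->data);
	if (n + s->meta_len > s->data)
		return -1;

	/* Slide the metadata back to make room for the new header. */
	meta = s->buf + s->data - s->meta_len;
	memmove(meta - n, meta, s->meta_len);
	s->data -= n;
	return (long)s->meta_len;
}
```

With 8 bytes of metadata, a 4-byte push moves all 8 metadata bytes even
though only 4 bytes of header are being added.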

-jkbs

PS. This series is not as long as it looks. I kept the more granular commit
split to "show the work". I can squash some together if needed.

[1] https://docs.ebpf.io/linux/concepts/dynptrs/
[2] https://lore.kernel.org/all/20250422-afabre-traits-010-rfc2-v2-0-92bcc6b146c9@arthurfabre.com/

Signed-off-by: Jakub Sitnicki <jakub@...udflare.com>
---
Jakub Sitnicki (13):
      bpf: Ignore dynptr offset in skb data access
      bpf: Helpers for skb dynptr read/write/slice
      bpf: Add new variant of skb dynptr for the metadata area
      bpf: Enable read access to skb metadata with bpf_dynptr_read
      bpf: Enable write access to skb metadata with bpf_dynptr_write
      bpf: Enable read-write access to skb metadata with dynptr slice
      net: Clear skb metadata on handover from device to protocol
      selftests/bpf: Pass just bpf_map to xdp_context_test helper
      selftests/bpf: Parametrize test_xdp_context_tuntap
      selftests/bpf: Cover read access to skb metadata via dynptr
      selftests/bpf: Cover write access to skb metadata via dynptr
      selftests/bpf: Cover lack of access to skb metadata at ip layer
      selftests/bpf: Count successful bpf program runs

 include/linux/filter.h                             |  25 ++-
 include/uapi/linux/bpf.h                           |   9 +
 kernel/bpf/helpers.c                               |  10 +-
 net/core/dev.c                                     |   1 +
 net/core/filter.c                                  | 104 +++++++++--
 tools/include/uapi/linux/bpf.h                     |   9 +
 .../bpf/prog_tests/xdp_context_test_run.c          | 194 +++++++++++++++++----
 tools/testing/selftests/bpf/progs/test_xdp_meta.c  | 171 ++++++++++++++++--
 tools/testing/selftests/bpf/test_progs.h           |   1 +
 9 files changed, 446 insertions(+), 78 deletions(-)

