Message-Id: <20250422-afabre-traits-010-rfc2-v2-0-92bcc6b146c9@arthurfabre.com>
Date: Tue, 22 Apr 2025 15:23:29 +0200
From: Arthur Fabre <arthur@...hurfabre.com>
To: netdev@...r.kernel.org, bpf@...r.kernel.org
Cc: jakub@...udflare.com, hawk@...nel.org, yan@...udflare.com, 
 jbrandeburg@...udflare.com, thoiland@...hat.com, lbiancon@...hat.com, 
 ast@...nel.org, kuba@...nel.org, edumazet@...gle.com, 
 Arthur Fabre <arthur@...hurfabre.com>
Subject: [PATCH RFC bpf-next v2 00/17] traits: Per packet metadata KV store

Today, the only way to attach information to an sk_buff that travels
through the network stack is with the mark. This field can be
read in firewall rules, can drive routing decisions, and can be
accessed by BPF programs.

However, at only 32 bits, its small size creates competition for bits
between users, restricting its practical use.

We propose using part of the packet headroom to store metadata.
This would allow:
- Tracing packets through the network stack and across the kernel-user
  space boundary, by assigning them a unique ID.
- Metadata-driven packet redirection, routing, and socket steering with
  early classification in XDP.
- Extracting information from encapsulation headers and sharing it with
  user space or vice versa.
- Exposing XDP RX Metadata, like the timestamp, to the rest of the
  network stack.

We originally proposed extending XDP metadata (binary blob storage,
also in the headroom) to expose it throughout the network stack.
However, feedback at LPC 2024 [1] was that:
- sharing a binary blob amongst different applications is hard.
- exposing a binary blob to userspace is awkward.
So we've shifted to a limited KV store in the headroom.

To differentiate this from the overloaded "metadata" term, it's
tentatively called "packet traits".

Traits are currently stored at the start of the headroom:

| xdp_frame | traits | headroom | XDP metadata | data / packet |

This makes adding encap headers to a packet easier: the traits don't
have to be moved out of the way first.
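
As a rough illustration of why this placement helps (a userspace toy
model, not code from this series): pushing an encap header only moves
the data offset towards the front of the buffer, so traits stored right
after the frame header stay where they are. The struct, function names,
and sizes below are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of: | frame | traits | headroom | data / packet |
 * All offsets are measured from the start of the buffer.
 * Assumes data_off >= frame_sz + traits_sz. */
struct toy_layout {
	size_t frame_sz;  /* bytes used by the xdp_frame-like header */
	size_t traits_sz; /* bytes reserved for traits, right after it */
	size_t data_off;  /* offset of the first packet byte */
};

/* Headroom still free between the traits and the packet data. */
static size_t toy_free_headroom(const struct toy_layout *l)
{
	return l->data_off - (l->frame_sz + l->traits_sz);
}

/* Prepend hdr_len bytes of encap header. Only data_off changes:
 * the traits keep their offset (frame_sz) and are never moved. */
static int toy_push_encap(struct toy_layout *l, size_t hdr_len)
{
	if (hdr_len > toy_free_headroom(l))
		return -ENOSPC;
	l->data_off -= hdr_len;
	return 0;
}
```

Had the traits been placed directly in front of the packet data
instead, every push would first have to relocate them.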

To keep the option of changing this layout in the future, XDP
metadata and traits currently can't be used together.

A get() / set() / delete() API is exposed to BPF to store and
retrieve traits.
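
To give a feel for the intended semantics, here is a minimal userspace
sketch of a get() / set() / delete() store with values packed in key
order. It is a toy model under our own assumptions: the toy_* names,
the 64 key limit, the 128 byte capacity, and the error codes are all
hypothetical, and this is not the encoding or the BPF API used by the
patches.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_KEYS 64  /* hypothetical key space */
#define TOY_CAP  128 /* hypothetical bytes available for values */

struct toy_traits {
	uint8_t present[TOY_KEYS]; /* 1 if the key is set */
	uint8_t size[TOY_KEYS];    /* value size: 0, 4 or 8 */
	uint8_t vals[TOY_CAP];     /* values, packed in key order */
	size_t used;               /* bytes of vals in use */
};

/* Offset of a key's value: sum of the sizes of all lower set keys. */
static size_t toy_offset(const struct toy_traits *t, unsigned int key)
{
	size_t off = 0;

	for (unsigned int k = 0; k < key; k++)
		if (t->present[k])
			off += t->size[k];
	return off;
}

static int toy_trait_set(struct toy_traits *t, unsigned int key,
			 const void *val, size_t len)
{
	size_t old, off;

	if (key >= TOY_KEYS || (len != 0 && len != 4 && len != 8))
		return -EINVAL;
	old = t->present[key] ? t->size[key] : 0;
	if (t->used - old + len > TOY_CAP)
		return -ENOSPC;
	off = toy_offset(t, key);
	/* Shift the values of higher keys to open or close the gap. */
	memmove(t->vals + off + len, t->vals + off + old,
		t->used - off - old);
	if (len)
		memcpy(t->vals + off, val, len);
	t->present[key] = 1;
	t->size[key] = (uint8_t)len;
	t->used = t->used - old + len;
	return 0;
}

/* Returns the value size on success, so a flag (size 0) returns 0. */
static int toy_trait_get(const struct toy_traits *t, unsigned int key,
			 void *val, size_t cap)
{
	if (key >= TOY_KEYS)
		return -EINVAL;
	if (!t->present[key])
		return -ENOENT;
	if (cap < t->size[key])
		return -ENOSPC;
	if (t->size[key])
		memcpy(val, t->vals + toy_offset(t, key), t->size[key]);
	return t->size[key];
}

static int toy_trait_del(struct toy_traits *t, unsigned int key)
{
	size_t off, len;

	if (key >= TOY_KEYS)
		return -EINVAL;
	if (!t->present[key])
		return -ENOENT;
	off = toy_offset(t, key);
	len = t->size[key];
	memmove(t->vals + off, t->vals + off + len, t->used - off - len);
	t->present[key] = 0;
	t->used -= len;
	return 0;
}
```

In this model a zero-sized value acts as a flag: only the presence of
the key carries information. Packing values keeps the store compact,
at the cost of a move on set() and delete().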

Initial benchmarks in XDP are promising, with the cost of get() / set()
comparable to that of an indirect function call. See patch 7: "trait:
Replace memmove calls with inline move" for full results.

We imagine adding first-class support for this in netfilter (setting
/ checking traits in rules) and routing (selecting routing tables
based on traits) in follow-up work.
We also envisage a first-class userspace API for storing and
retrieving traits in the future.

Like XDP metadata, this relies on there being sufficient headroom
available. Piggybacking on top of that work, traits are currently
only supported:
- On ingress.
- By NIC drivers that support XDP metadata.
- When an XDP program is attached.
This limits the applicability of traits. But future work
guaranteeing sufficient headroom through other means should allow
these restrictions to be lifted.

[1] https://lpc.events/event/18/contributions/1935/

---
Changes in v2:
- Support value sizes of 0 (for flags), 4, and 8 bytes. 16 bytes will be
  supported in the future with a batch API, to set two consecutive
  8-byte KVs at once.
- Prevent traits and XDP metadata from being used at the same time.
  This will let us move trait storage where XDP metadata is today if
  we want to.
- Use SKB extensions to store the traits in skbs.
- Drop registration API.
- Link to v1: https://lore.kernel.org/r/20250305-afabre-traits-010-rfc2-v1-0-d0ecfb869797@cloudflare.com

---
Arthur Fabre (16):
      trait: limited KV store for packet metadata
      xdp: Track if metadata is supported in xdp_frame <> xdp_buff conversions
      trait: XDP support
      trait: XDP selftest
      trait: XDP benchmark
      trait: Replace memcpy calls with inline copies
      trait: Replace memmove calls with inline move
      skb: Extension header in packet headroom
      trait: Store traits in sk_buff extension
      bnxt: Propagate trait presence to skb
      ice: Propagate trait presence to skb
      veth: Propagate trait presence to skb
      virtio_net: Propagate trait presence to skb
      mlx5: Propagate trait presence to skb
      xdp generic: Propagate trait presence to skb
      trait: Allow socket filters to access traits

Jesper Dangaard Brouer (1):
      mlx5: move xdp_buff scope one level up

 drivers/net/ethernet/broadcom/bnxt/bnxt.c          |   4 +
 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c     |   5 -
 drivers/net/ethernet/intel/ice/ice_txrx.c          |   4 +
 drivers/net/ethernet/intel/ice/ice_xsk.c           |   2 +
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |   6 +-
 .../net/ethernet/mellanox/mlx5/core/en/xsk/rx.c    |   6 +-
 .../net/ethernet/mellanox/mlx5/core/en/xsk/rx.h    |   6 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    | 114 ++++----
 drivers/net/veth.c                                 |   4 +
 drivers/net/virtio_net.c                           |   8 +-
 include/linux/skbuff.h                             |  42 +++
 include/net/trait.h                                | 302 +++++++++++++++++++++
 include/net/xdp.h                                  |  56 +++-
 net/core/dev.c                                     |   1 +
 net/core/filter.c                                  |  10 +-
 net/core/skbuff.c                                  | 231 ++++++++++++++--
 net/core/xdp.c                                     |  69 ++++-
 net/xdp/xsk.c                                      |  11 +-
 tools/testing/selftests/bpf/Makefile               |   2 +
 tools/testing/selftests/bpf/bench.c                |   8 +
 .../selftests/bpf/benchs/bench_xdp_traits.c        | 160 +++++++++++
 .../testing/selftests/bpf/prog_tests/xdp_traits.c  |  33 +++
 .../testing/selftests/bpf/progs/bench_xdp_traits.c | 128 +++++++++
 .../testing/selftests/bpf/progs/test_xdp_traits.c  | 206 ++++++++++++++
 24 files changed, 1319 insertions(+), 99 deletions(-)
---
base-commit: 5709be4c35ba760b001733939e20069de033a697
change-id: 20250305-afabre-traits-010-rfc2-a8e4de0c490b

Best regards,
-- 
Arthur Fabre <arthur@...hurfabre.com>
