netdev - Re: [PATCH RFC bpf-next v2 01/17] trait: limited KV store for packet metadata

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87ikmle9t4.fsf@cloudflare.com>
Date: Wed, 30 Apr 2025 21:19:51 +0200
From: Jakub Sitnicki <jakub@...udflare.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>, Toke
 Høiland-Jørgensen <toke@...hat.com>, Arthur Fabre
 <arthur@...hurfabre.com>
Cc: Network Development <netdev@...r.kernel.org>,  bpf
 <bpf@...r.kernel.org>,  Jesper Dangaard Brouer <hawk@...nel.org>,  Yan
 Zhai <yan@...udflare.com>,  jbrandeburg@...udflare.com,
  lbiancon@...hat.com,  Alexei Starovoitov <ast@...nel.org>,  Jakub
 Kicinski <kuba@...nel.org>,  Eric Dumazet <edumazet@...gle.com>,
 kernel-team@...udflare.com
Subject: Re: [PATCH RFC bpf-next v2 01/17] trait: limited KV store for
 packet metadata

On Wed, Apr 30, 2025 at 11:19 AM +02, Toke Høiland-Jørgensen wrote:
> Alexei Starovoitov <alexei.starovoitov@...il.com> writes:
>
>> On Fri, Apr 25, 2025 at 12:27 PM Arthur Fabre <arthur@...hurfabre.com> wrote:
>>>
>>> On Thu Apr 24, 2025 at 6:22 PM CEST, Alexei Starovoitov wrote:
>>> > On Tue, Apr 22, 2025 at 6:23 AM Arthur Fabre <arthur@...hurfabre.com> wrote:

[...]

>>> * Hardware metadata: metadata exposed from NICs (like the receive
>>>   timestamp, 4 tuple hash...) is currently only exposed to XDP programs
>>>   (via kfuncs).
>>>   But that doesn't expose them to the rest of the stack.
>>>   Storing them in traits would allow XDP, other BPF programs, and the
>>>   kernel to access and modify them (for example to into account
>>>   decapsulating a packet).
>>
>> Sure. If traits == existing metadata bpf prog in xdp can communicate
>> with bpf prog in skb layer via that "trait" format.
>> xdp can take tuple hash and store it as key==0 in the trait.
>> The kernel doesn't need to know how to parse that format.
>
> Yes it does, to propagate it to the skb later. I.e.,
>
> XDP prog on NIC: get HW hash, store in traits, redirect to CPUMAP
> CPUMAP: build skb, read hash from traits, populate skb hash
>
> Same thing for (at least) timestamps and checksums.
>
> Longer term, with traits available we could move more skb fields into
> traits to make struct sk_buff smaller (by moving optional fields to
> traits that don't take up any space if they're not set).

Perhaps we can have the cake and eat it too.

We could leave the traits encoding/decoding out of the kernel and, at
the same time, *expose it* to the network stack through BPF struct_ops
programs. At a high level, for example ->get_rx_hash(), not the
individual K/V access. The traits_ops vtable could grow as needed to
support new use cases.

If you think about it, it's not so different from BPF-powered congestion
algorithms and scheduler extensions. They also expose some state, kept in
maps, that only the loaded BPF code knows how to operate on.