Message-ID: <875xik7gsk.fsf@toke.dk>
Date: Thu, 01 May 2025 12:43:07 +0200
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Jakub Sitnicki <jakub@...udflare.com>, Alexei Starovoitov
 <alexei.starovoitov@...il.com>, Arthur Fabre <arthur@...hurfabre.com>
Cc: Network Development <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
 Jesper Dangaard Brouer <hawk@...nel.org>, Yan Zhai <yan@...udflare.com>,
 jbrandeburg@...udflare.com, lbiancon@...hat.com, Alexei Starovoitov
 <ast@...nel.org>, Jakub Kicinski <kuba@...nel.org>, Eric Dumazet
 <edumazet@...gle.com>, kernel-team@...udflare.com
Subject: Re: [PATCH RFC bpf-next v2 01/17] trait: limited KV store for
 packet metadata

Jakub Sitnicki <jakub@...udflare.com> writes:

> On Wed, Apr 30, 2025 at 11:19 AM +02, Toke Høiland-Jørgensen wrote:
>> Alexei Starovoitov <alexei.starovoitov@...il.com> writes:
>>
>>> On Fri, Apr 25, 2025 at 12:27 PM Arthur Fabre <arthur@...hurfabre.com> wrote:
>>>>
>>>> On Thu Apr 24, 2025 at 6:22 PM CEST, Alexei Starovoitov wrote:
>>>> > On Tue, Apr 22, 2025 at 6:23 AM Arthur Fabre <arthur@...hurfabre.com> wrote:
>
> [...]
>
>>>> * Hardware metadata: metadata exposed from NICs (like the receive
>>>>   timestamp, 4 tuple hash...) is currently only exposed to XDP programs
>>>>   (via kfuncs).
>>>>   But that doesn't expose them to the rest of the stack.
>>>>   Storing them in traits would allow XDP, other BPF programs, and the
>>>>   kernel to access and modify them (for example to take into account
>>>>   decapsulating a packet).
>>>
>>> Sure. If traits == existing metadata bpf prog in xdp can communicate
>>> with bpf prog in skb layer via that "trait" format.
>>> xdp can take tuple hash and store it as key==0 in the trait.
>>> The kernel doesn't need to know how to parse that format.
>>
>> Yes it does, to propagate it to the skb later. I.e.,
>>
>> XDP prog on NIC: get HW hash, store in traits, redirect to CPUMAP
>> CPUMAP: build skb, read hash from traits, populate skb hash
>>
>> Same thing for (at least) timestamps and checksums.
>>
>> Longer term, with traits available we could move more skb fields into
>> traits to make struct sk_buff smaller (by moving optional fields to
>> traits that don't take up any space if they're not set).
>
> Perhaps we can have the cake and eat it too.
>
> We could leave the traits encoding/decoding out of the kernel and, at
> the same time, *expose it* to the network stack through BPF struct_ops
> programs. At a high level, for example ->get_rx_hash(), not the
> individual K/V access. The traits_ops vtable could grow as needed to
> support new use cases.
>
> If you think about it, it's not so different from BPF-powered congestion
> algorithms and scheduler extensions. They also expose some state, kept in
> maps, that only the loaded BPF code knows how to operate on.

Right, the difference being that the kernel works perfectly well without
an eBPF congestion control algorithm loaded because it has its own
internal implementation that is used by default.

Having a hard dependency on BPF for in-kernel functionality is a
different matter, and limits the cases it can be used for.
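
To illustrate the distinction, here is a minimal userspace sketch (not
code from this series; struct pkt, struct hash_ops, register_ops() and
pkt_rx_hash() are all hypothetical names). An extension may override the
behaviour, but the built-in path still works when nothing is loaded,
which is the congestion control situation:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet carrying a hardware-provided hash. */
struct pkt {
	uint32_t hw_hash;
};

/* Hypothetical ops table an extension could register. */
struct hash_ops {
	uint32_t (*get_rx_hash)(const struct pkt *p);
};

/* NULL unless an extension has been registered. */
static const struct hash_ops *loaded_ops;

void register_ops(const struct hash_ops *ops)
{
	loaded_ops = ops;
}

/* Built-in default: always available, no extension required. */
uint32_t builtin_get_rx_hash(const struct pkt *p)
{
	return p->hw_hash;
}

/* Example override, only here to show the optional path. */
uint32_t example_get_rx_hash(const struct pkt *p)
{
	return p->hw_hash ^ 1u;
}

uint32_t pkt_rx_hash(const struct pkt *p)
{
	/* Use the extension if one is loaded, otherwise fall back. */
	if (loaded_ops && loaded_ops->get_rx_hash)
		return loaded_ops->get_rx_hash(p);
	return builtin_get_rx_hash(p);
}
```

If the fallback branch did not exist, pkt_rx_hash() would be unusable
until something called register_ops(), which is the hard dependency
described above.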

Besides, I don't really see the point of leaving the encoding out of the
kernel? We keep the encoding kernel-internal anyway, and just expose a
get/set API, so there's no constraint on changing it later (that's kinda
the whole point of doing that). And with bulk get/set there's not an
efficiency argument either. So what's the point, other than doing things
in BPF for its own sake?
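
As a rough userspace sketch of what "kernel-internal encoding behind a
get/set API" means (this is not the encoding or the function signatures
from the patch; struct traits, traits_set(), traits_get() and
TRAITS_MAX_KEYS are hypothetical and chosen only for illustration):

```c
#include <errno.h>
#include <stdint.h>

#define TRAITS_MAX_KEYS 8

/*
 * Hypothetical store. Callers never touch these fields directly, so
 * the layout (here: a presence bitmap plus fixed slots) could later be
 * replaced by a packed encoding without changing any caller.
 */
struct traits {
	uint32_t present;		/* bit k set => key k has a value */
	uint64_t val[TRAITS_MAX_KEYS];
};

/* Store a value under key; returns 0 or -EINVAL for a bad key. */
int traits_set(struct traits *t, unsigned int key, uint64_t v)
{
	if (key >= TRAITS_MAX_KEYS)
		return -EINVAL;
	t->present |= 1u << key;
	t->val[key] = v;
	return 0;
}

/* Fetch the value for key; returns -ENOENT if it was never set. */
int traits_get(const struct traits *t, unsigned int key, uint64_t *v)
{
	if (key >= TRAITS_MAX_KEYS)
		return -EINVAL;
	if (!(t->present & (1u << key)))
		return -ENOENT;
	*v = t->val[key];
	return 0;
}
```

In the CPUMAP flow quoted above, the XDP side would call something like
traits_set() with the hardware hash, and the skb-building side would
call traits_get() and populate the skb hash only if it returns 0.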

-Toke

