Message-ID: <b23ed0e2-05cf-454b-bf7a-a637c9bb48e8@kernel.org>
Date: Tue, 29 Jul 2025 13:15:53 +0200
From: Jesper Dangaard Brouer <hawk@...nel.org>
To: Jakub Kicinski <kuba@...nel.org>, Lorenzo Bianconi <lorenzo@...nel.org>
Cc: Stanislav Fomichev <stfomichev@...il.com>, bpf@...r.kernel.org,
 netdev@...r.kernel.org, Alexei Starovoitov <ast@...nel.org>,
 Daniel Borkmann <borkmann@...earbox.net>,
 Eric Dumazet <eric.dumazet@...il.com>, "David S. Miller"
 <davem@...emloft.net>, Paolo Abeni <pabeni@...hat.com>, sdf@...ichev.me,
 kernel-team@...udflare.com, arthur@...hurfabre.com, jakub@...udflare.com,
 Jesse Brandeburg <jbrandeburg@...udflare.com>,
 Andrew Rzeznik <arzeznik@...udflare.com>
Subject: Re: [PATCH bpf-next V2 0/7] xdp: Allow BPF to set RX hints for
 XDP_REDIRECTed packets



On 28/07/2025 18.29, Jakub Kicinski wrote:
> On Mon, 28 Jul 2025 12:53:01 +0200 Lorenzo Bianconi wrote:
>>>> I can see why you might think that, but from my perspective, the
>>>> xdp_frame *is* the implementation of the mini-SKB concept. We've been
>>>> building it incrementally for years. It started as the most minimal
>>>> structure possible and has gradually gained more context (e.g. dev_rx,
>>>> mem_info/rxq_info, flags, and also uses skb_shared_info with same layout
>>>> as SKB).
>>>
>>> My understanding was that just adding all the fields to xdp_frame was
>>> considered too wasteful. Otherwise we would have done something along
>>> those lines ~10 years ago :S
>>
>> Hi Jakub,
>>
>> sorry for the late reply.

Same, back from vacation.

>> I am completely fine to redesign the solution to overcome the problem but I
>> guess this feature will allow us to improve XDP performance in a common/real
>> use-case. Let's consider we want to redirect a packet into a veth and then into
>> a container. Preserving the hw metadata performing XDP_REDIRECT will allow us
>> to avoid recalculating the checksum creating the skb. This will result in a
>> very nice performance improvement.
>> So I guess we should really come up with some idea to add this missing feature.
> 
> 
> Martin mentioned to me that he had proposed in the past that we allow
> allocating the skb at the XDP level, if the program needs "skb-level
> metadata". That actually seems pretty clean to me.. Was it ever
> explored?

That idea has been considered before, but it unfortunately doesn't work
from a performance angle. The performance model of XDP_REDIRECT into
CPUMAP relies on moving the expensive SKB allocation+init to a remote
CPU. This keeps the ingress CPU free to process packets at near line
rate (our DDoS use-case). If we allocate the SKB on the ingress CPU
before the redirect, we destroy this load-balancing model and create the
exact bottleneck that CPUMAP was designed to avoid.

To bring the focus back to the specific problem this series solves,
let's review the concrete use case. Our IPsec scenario is a key example:
on the ingress CPU, an XDP program calculates a hash from inner packet
headers to load-balance traffic via CPUMAP. When the packet arrives on
the remote CPU, this hash is lost, so the new SKB is created with a hash
of zero. This, in turn, causes poor load-balancing when the packet is
forwarded to a multi-queue device like veth, as traffic often collapses
to a single queue. The purpose of this patchset is simply to provide a
standard way to carry that hash to the remote CPU within the xdp_frame.
(The same goes for VLAN tags, which also need a standard way to be carried.)
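
To make the queue collapse concrete, here is a toy userspace sketch in C.
It is not kernel code and not part of the patchset. The names
toy_flow_hash(), toy_scale() and count_used_queues() are hypothetical;
toy_scale() only mirrors the multiply-and-shift arithmetic of the kernel's
reciprocal_scale(), and the multiplicative hash is an arbitrary stand-in
for whatever the XDP program computes from the inner headers. The model
assumes the TX queue is picked purely from the packet hash.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_QUEUES 4u
#define NUM_FLOWS  8u

/* Map a 32-bit value into [0, n) with the same multiply-and-shift
 * arithmetic as the kernel's reciprocal_scale(). */
static uint32_t toy_scale(uint32_t val, uint32_t n)
{
	return (uint32_t)(((uint64_t)val * n) >> 32);
}

/* Hypothetical stand-in for the hash the XDP program derives from the
 * inner packet headers (assumption: any well-mixing function will do). */
static uint32_t toy_flow_hash(uint32_t flow_id)
{
	return flow_id * 2654435761u; /* wraps modulo 2^32 */
}

/* Count how many distinct queues NUM_FLOWS flows land on.
 * carry_hash == 0 models the hash being lost across the redirect,
 * so every packet sees hash 0 on the remote CPU.
 * carry_hash != 0 models the hash travelling with the frame. */
static unsigned int count_used_queues(int carry_hash)
{
	int used[NUM_QUEUES] = { 0 };
	unsigned int distinct = 0;

	for (uint32_t flow = 1; flow <= NUM_FLOWS; flow++) {
		uint32_t hash = carry_hash ? toy_flow_hash(flow) : 0;
		uint32_t q = toy_scale(hash, NUM_QUEUES);

		assert(q < NUM_QUEUES); /* toy_scale() always stays in range */
		if (!used[q]) {
			used[q] = 1;
			distinct++;
		}
	}
	return distinct;
}
```

Calling count_used_queues(0) puts every flow on one queue, because a zero
hash always scales to queue 0. Calling count_used_queues(1) spreads the
same flows over several queues. Real queue selection has more inputs than
this, so treat it only as an illustration of why the hash value matters.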

Given this specific problem, is there a better approach to solving it
than what this patchset proposes?

--Jesper
