lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250717182534.4f305f8a@kernel.org>
Date: Thu, 17 Jul 2025 18:25:34 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Jesper Dangaard Brouer <hawk@...nel.org>
Cc: Lorenzo Bianconi <lorenzo@...nel.org>, Stanislav Fomichev
 <stfomichev@...il.com>, bpf@...r.kernel.org, netdev@...r.kernel.org, Alexei
 Starovoitov <ast@...nel.org>, Daniel Borkmann <borkmann@...earbox.net>,
 Eric Dumazet <eric.dumazet@...il.com>, "David S. Miller"
 <davem@...emloft.net>, Paolo Abeni <pabeni@...hat.com>, sdf@...ichev.me,
 kernel-team@...udflare.com, arthur@...hurfabre.com, jakub@...udflare.com,
 Jesse Brandeburg <jbrandeburg@...udflare.com>
Subject: Re: [PATCH bpf-next V2 0/7] xdp: Allow BPF to set RX hints for
 XDP_REDIRECTed packets

On Thu, 17 Jul 2025 15:08:49 +0200 Jesper Dangaard Brouer wrote:
> Let me explain why it is a bad idea of writing into the RX descriptors.
> The DMA descriptors are allocated as coherent DMA (dma_alloc_coherent).
> This is memory that is shared with the NIC hardware device, which
> implies cache-line coherence.  NIC performance is tightly coupled to
> limiting cache misses for descriptors.  One common trick is to pack more
> descriptors into a single cache-line.  Thus, if we start to write into
> the current RX-descriptor, then we invalidate that cache-line seen from
> the device, and next RX-descriptor (from this cache-line) will be in an
> unfortunate coherent state.  Behind the scene this might lead to some
> extra PCIe transactions.
> 
> By writing to the xdp_frame, we don't have to modify the DMA descriptors
> directly and risk invalidating cache lines for the NIC.

I thought you main use case is redirected packets. In which case it's
the _remote_ end that's writing its metadata, if it's veth it's
obviously not going to be doing it into DMA coherent memory.

The metadata travels between the source and destination in program-
-defined format.

> Thanks for the feedback. I can see why you'd be concerned about adding
> another adhoc scheme or making xdp_frame grow into a "para-skb".
> 
> However, I'd like to frame this as part of a long-term plan we've been
> calling the "mini-SKB" concept. This isn't a new idea, but a
> continuation of architectural discussions from as far back as [2016].

My understanding is that while this was floated as a plan by some,
nobody came up with a clean way of implementing it.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ