Open Source and information security mailing list archives
 
Message-ID: <aIvdlJts5JQLuzLE@lore-rh-laptop>
Date: Thu, 31 Jul 2025 23:18:12 +0200
From: Lorenzo Bianconi <lorenzo@...nel.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Jesper Dangaard Brouer <hawk@...nel.org>,
	Stanislav Fomichev <stfomichev@...il.com>, bpf@...r.kernel.org,
	netdev@...r.kernel.org, Alexei Starovoitov <ast@...nel.org>,
	Daniel Borkmann <borkmann@...earbox.net>,
	Eric Dumazet <eric.dumazet@...il.com>,
	"David S. Miller" <davem@...emloft.net>,
	Paolo Abeni <pabeni@...hat.com>, sdf@...ichev.me,
	kernel-team@...udflare.com, arthur@...hurfabre.com,
	jakub@...udflare.com, Jesse Brandeburg <jbrandeburg@...udflare.com>
Subject: Re: [PATCH bpf-next V2 0/7] xdp: Allow BPF to set RX hints for
 XDP_REDIRECTed packets

On Jul 28, Jakub Kicinski wrote:
> On Mon, 28 Jul 2025 12:53:01 +0200 Lorenzo Bianconi wrote:
> > > > I can see why you might think that, but from my perspective, the
> > > > xdp_frame *is* the implementation of the mini-SKB concept. We've been
> > > > building it incrementally for years. It started as the most minimal
> > > > structure possible and has gradually gained more context (e.g. dev_rx,
> > > > mem_info/rxq_info, flags, and also uses skb_shared_info with same layout
> > > > as SKB).  
> > > 
> > > My understanding was that just adding all the fields to xdp_frame was
> > > considered too wasteful. Otherwise we would have done something along
> > > those lines ~10 years ago :S  
> > 
> > Hi Jakub,
> > 
> > sorry for the late reply.
> > I am completely fine to redesign the solution to overcome the problem but I
> > guess this feature will allow us to improve XDP performance in a common/real
> > use-case. Let's consider we want to redirect a packet into a veth and then into
> > a container. Preserving the hw metadata performing XDP_REDIRECT will allow us
> > to avoid recalculating the checksum creating the skb. This will result in a
> > very nice performance improvement.
> > So I guess we should really come up with some idea to add this missing feature.
> 
> I don't think the counter-proposal prevents that. As long as veth
> supports "set" callbacks the program can transfer the metadata over
> to the veth and the second program at veth can communicate them to 
> the driver.

IIUC the 'set' proposal (please correct me if I am wrong), the eBPF program
running on the NIC that receives the packet from the wire is supposed to
set (or update) the hw metadata (e.g. RX hash or RX checksum) in the RX DMA
descriptor associated with the packet, to be consumed later. Am I right?
I think this approach works fine if the SKB is created locally in the NAPI
loop of the receiving driver (e.g. if the eBPF program bound to the NIC
returns XDP_PASS), but I guess it does not work if the packet is redirected
to a remote CPU or a remote device (e.g. veth). Considering the veth
use-case, veth_ndo_xdp_xmit() enqueues the packet into a ptr_ring and
schedules NAPI. By the time that NAPI runs, I guess the DMA descriptor
originally associated with the packet has already been queued back to the
hw ring to be reused for a following packet. In order to consume this hw
metadata easily, I guess we should store it in the same packet buffer.
Am I missing something?
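
To make the concern concrete, here is a toy userspace model of the
descriptor-recycling problem. It is only a sketch: it is not kernel code,
and every name in it (toy_rx_desc, toy_frame, toy_rx, toy_demo, RING_SIZE)
is hypothetical and does not correspond to any driver or XDP API. It assumes
a tiny RX ring whose slots are overwritten in order, and compares reading
the hash through the descriptor against reading a copy kept with the frame.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 4 /* hypothetical, deliberately tiny RX ring */

/* Toy RX descriptor: metadata written by "hardware", outside the packet. */
struct toy_rx_desc {
	uint32_t rx_hash;
	uint8_t csum_ok;
};

/* Toy frame: remembers its descriptor slot and may carry a metadata copy. */
struct toy_frame {
	unsigned int desc_idx;  /* slot that described this frame on RX */
	uint32_t saved_hash;    /* copy stored together with the packet */
	uint8_t saved_csum_ok;
	uint8_t has_saved;
};

static struct toy_rx_desc ring[RING_SIZE];

/* "Receive" packet number seq: the slot seq % RING_SIZE gets overwritten. */
static struct toy_frame toy_rx(unsigned int seq, int save_in_frame)
{
	struct toy_frame f;

	memset(&f, 0, sizeof(f));
	f.desc_idx = seq % RING_SIZE;
	ring[f.desc_idx].rx_hash = 0x1000u + seq;
	ring[f.desc_idx].csum_ok = 1;

	if (save_in_frame) {
		/* Copy the metadata next to the packet before redirecting. */
		f.saved_hash = ring[f.desc_idx].rx_hash;
		f.saved_csum_ok = ring[f.desc_idx].csum_ok;
		f.has_saved = 1;
	}
	return f;
}

/* Deferred consumer that goes back to the descriptor slot. */
static uint32_t toy_hash_from_desc(const struct toy_frame *f)
{
	return ring[f->desc_idx].rx_hash;
}

/* Deferred consumer that reads the copy carried with the frame. */
static uint32_t toy_hash_from_frame(const struct toy_frame *f)
{
	return f->saved_hash;
}

/*
 * Packet 0 is "redirected" (its consumer runs later). Before that consumer
 * runs, RING_SIZE more packets arrive and the slot of packet 0 is reused.
 * Returns 1 if the descriptor lookup is stale while the frame copy is intact.
 */
int toy_demo(void)
{
	const uint32_t expected = 0x1000u; /* hash "hardware" gave packet 0 */
	struct toy_frame redirected = toy_rx(0, 1);
	unsigned int seq;

	assert(redirected.has_saved);

	for (seq = 1; seq <= RING_SIZE; seq++)
		(void)toy_rx(seq, 0);

	return toy_hash_from_desc(&redirected) != expected &&
	       toy_hash_from_frame(&redirected) == expected;
}
```

Calling toy_demo() in this model shows the deferred consumer seeing the hash
of a later packet when it goes through the descriptor, while the copy kept
with the frame still matches. The real veth path is of course more involved.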

Regards,
Lorenzo

> 
> Martin mentioned to me that he had proposed in the past that we allow
> allocating the skb at the XDP level, if the program needs "skb-level
> metadata". That actually seems pretty clean to me.. Was it ever
> explored?
