Message-ID: <20251029162245.5ea2ee3e@kernel.org>
Date: Wed, 29 Oct 2025 16:22:45 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Fernando Fernandez Mancera <fmancera@...e.de>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org, magnus.karlsson@...el.com,
maciej.fijalkowski@...el.com, sdf@...ichev.me, kerneljasonxing@...il.com,
fw@...len.de
Subject: Re: [PATCH 2/2 bpf v2] xsk: avoid data corruption on cq descriptor number

On Wed, 29 Oct 2025 08:51:58 +0100 Fernando Fernandez Mancera wrote:
> On 10/29/25 12:01 AM, Jakub Kicinski wrote:
> > On Tue, 28 Oct 2025 19:30:32 +0100 Fernando Fernandez Mancera wrote:
> >> Since commit 30f241fcf52a ("xsk: Fix immature cq descriptor
> >> production"), the descriptor number is stored in skb control block and
> >> xsk_cq_submit_addr_locked() relies on it to put the umem addrs onto
> >> pool's completion queue.
> >
> > Looking at the past discussion it sounds like you want to optimize
> > the single descriptor case? Can you not use a magic pointer for that?
> >
> > #define XSK_DESTRUCT_SINGLE_BUF (void *)1
> > destructor_arg = XSK_DESTRUCT_SINGLE_BUF
> >
> > Let's target this fix at net, please, I think the complexity here is
> > all in skbs paths.
>
> I might be missing something here but if the destructor_arg pointer is
> used to do this, where should we store the umem address associated with
> it? In the proposed approach the skb extension should not be increased
> for non-fragmented traffic as there is only a single descriptor and
> therefore we can store the umem address in destructor_arg directly.

I see. Pointers are always aligned to 8B, so you can stash the "pointer
type" in the low bits. If the bottom bit is 1 it's a umem address and the
skb was single-chunk. If the bottom bit is 0 it's a full kmalloc'ed struct.

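A minimal userspace sketch of that tagging idea, not the actual patch. Every name here (SKETCH_SINGLE_BUF_TAG, struct sketch_addrs, the sketch_* helpers) is hypothetical, and shifting the address left by one to free the bottom bit is my assumption; it only works if the umem address fits in 63 bits and pointers are 64-bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; none of these exist in the kernel. */
#define SKETCH_SINGLE_BUF_TAG ((uintptr_t)1)

/* Stand-in for the heap-allocated struct used for multi-descriptor skbs. */
struct sketch_addrs {
	uint32_t num_descs;
	uint64_t addrs[4];
};

/* Assumption: the shifted address must fit in a pointer-sized integer. */
_Static_assert(sizeof(uintptr_t) >= sizeof(uint64_t),
	       "sketch assumes 64-bit pointers");

/* Single-chunk case: store the address itself, bottom bit set as the tag. */
static void *sketch_encode_single(uint64_t umem_addr)
{
	/* Assumption: the top bit is unused, so the shift loses nothing. */
	assert((umem_addr >> 63) == 0);
	return (void *)(((uintptr_t)umem_addr << 1) | SKETCH_SINGLE_BUF_TAG);
}

/* True when the value carries an inline address rather than a pointer. */
static bool sketch_is_single(const void *arg)
{
	return ((uintptr_t)arg & SKETCH_SINGLE_BUF_TAG) != 0;
}

/* Undo the encoding: drop the tag bit and recover the address. */
static uint64_t sketch_decode_single(const void *arg)
{
	return (uint64_t)((uintptr_t)arg >> 1);
}

/* Multi-chunk case: a real allocation, whose bottom bit is always 0. */
static struct sketch_addrs *sketch_alloc_multi(void)
{
	struct sketch_addrs *p = calloc(1, sizeof(*p));

	/* Allocator alignment guarantees the tag bit is clear. */
	assert(p == NULL || !sketch_is_single(p));
	return p;
}
```

A consumer would call sketch_is_single() first and then either decode the address in place or treat the value as a struct sketch_addrs pointer and free it when done.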
> The size of the skb extension will only increase for fragmented traffic
> (multiple descriptors), but sure, if there is a fallback to the
> slowpath, it will hurt performance a bit. Although, for that to
> happen the user must have tried to use the AF_XDP family initially. AFAICS,
> the size of the skb extension is only increased when skb_ext_add() is called.

To be clear, by adding an skb extension you are de facto allocating
a bit in the skb struct: just one of the bits of the active_extensions
field instead of a separate bitfield. If you can depend on the socket
association instead, this is quite wasteful.