Message-Id: <89fee0bdf782c9fab10781be2a42c099bfb9f99a53c342dd6e41fd99651943f6@mail.kernel.org>
Date: Tue, 6 Jan 2026 01:47:52 +0000 (UTC)
From: bot+bpf-ci@...nel.org
To: kerneljasonxing@...il.com,davem@...emloft.net,edumazet@...gle.com,kuba@...nel.org,pabeni@...hat.com,bjorn@...nel.org,magnus.karlsson@...el.com,maciej.fijalkowski@...el.com,jonathan.lemon@...il.com,sdf@...ichev.me,ast@...nel.org,daniel@...earbox.net,hawk@...nel.org,john.fastabend@...il.com
Cc: bpf@...r.kernel.org,netdev@...r.kernel.org,kernelxing@...cent.com,ast@...nel.org,andrii@...nel.org,daniel@...earbox.net,martin.lau@...nel.org,eddyz87@...il.com,yonghong.song@...ux.dev,clm@...a.com,ihor.solodrai@...ux.dev
Subject: Re: [PATCH bpf-next v4 2/2] xsk: introduce a dedicated local completion queue for each xsk
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index f41e0b480aa4..f80721165fdc 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
[ ... ]
> -static int xsk_cq_reserve_locked(struct xsk_buff_pool *pool)
> +static int xsk_cq_reserve_addr_locked(struct xdp_sock *xs, u64 addr)
> {
> + struct xsk_buff_pool *pool = xs->pool;
> + struct local_cq *lcq = xs->lcq;
> int ret;
>
> spin_lock(&pool->cq_cached_prod_lock);
> ret = xskq_prod_reserve(pool->cq);
> spin_unlock(&pool->cq_cached_prod_lock);
> + if (!ret)
> + lcq->desc[lcq->prod++ & lcq->ring_mask] = addr;
>
> return ret;
> }
[ ... ]
> +/* Record the position of first desc in local cq */
> +static void xsk_skb_destructor_set_addr(struct sk_buff *skb,
> + struct xdp_sock *xs)
> {
> - skb_shinfo(skb)->destructor_arg = (void *)((uintptr_t)addr | 0x1UL);
> + long val;
> +
> + val = ((xs->lcq->prod - 1) & xs->lcq->ring_mask) << XSK_DESTRUCTOR_DESCS_SHIFT;
> + skb_shinfo(skb)->destructor_arg = (void *)val;
> }
[ ... ]
> +static void xsk_cq_write_addr(struct sk_buff *skb, u32 desc_processed)
> +{
> + struct xsk_buff_pool *pool = xdp_sk(skb->sk)->pool;
> + u32 idx, pos = xsk_get_start_addr(skb);
> + struct xdp_sock *xs = xdp_sk(skb->sk);
> + u64 addr;
> +
> + idx = xskq_get_prod(pool->cq) + desc_processed;
> + addr = xs->lcq->desc[(pos + desc_processed) & xs->lcq->ring_mask];
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> + xskq_prod_write_addr(pool->cq, idx, addr);
> +}
Can this read stale or corrupted data from lcq->desc[] if skb destructors
run out of order?
Consider this scenario with a local completion queue (lcq) of 4 entries:
1. SKB1 reserves and writes to lcq[0], stores start_pos=0
2. SKB2 reserves and writes to lcq[1], stores start_pos=1
3. SKB3 reserves and writes to lcq[2], stores start_pos=2
4. SKB4 reserves and writes to lcq[3], stores start_pos=3
5. Backpressure - cq is full
6. Network completes SKB4 first (out of order), destructor reads lcq[3],
submits to cq, space becomes available
7. SKB5 reserves and writes to lcq[0] (wraparound), overwriting SKB1's data
8. SKB1's destructor finally runs, reads lcq[0] which now contains SKB5's
address instead of SKB1's original address
The old code stored addresses in a per-skb xsk_addrs structure, so each skb
had its own copy that could not be overwritten. The new shared ring buffer
design appears vulnerable to out-of-order completions corrupting the address
data; a minimal sketch of the failure mode follows.
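To make the suspected race concrete, below is a small userspace C sketch of
the scenario above. It is not kernel code: lcq_desc, lcq_prod, and start_pos
are simplified, illustrative stand-ins for xs->lcq->desc[], xs->lcq->prod and
the index stored in destructor_arg, and the out-of-order completion is forced
by calling the "destructors" by hand.

/*
 * Userspace sketch of the suspected race; all names are illustrative
 * stand-ins for the fields used in the patch.
 */
#include <stdio.h>
#include <stdint.h>

#define LCQ_SIZE 4
#define LCQ_MASK (LCQ_SIZE - 1)

static uint64_t lcq_desc[LCQ_SIZE];	/* shared per-socket ring, as in xs->lcq */
static uint32_t lcq_prod;

/* Per-skb state: only the starting slot index is kept (destructor_arg). */
struct fake_skb {
	uint32_t start_pos;
	uint64_t expected_addr;
};

/* Mimics xsk_cq_reserve_addr_locked(): write addr into the shared ring. */
static void reserve_addr(struct fake_skb *skb, uint64_t addr)
{
	skb->start_pos = lcq_prod & LCQ_MASK;
	skb->expected_addr = addr;
	lcq_desc[lcq_prod++ & LCQ_MASK] = addr;
}

/* Mimics xsk_cq_write_addr(): read the addr back via the stored index. */
static void destructor(struct fake_skb *skb)
{
	uint64_t addr = lcq_desc[skb->start_pos & LCQ_MASK];

	printf("skb expecting addr 0x%llx, read 0x%llx%s\n",
	       (unsigned long long)skb->expected_addr,
	       (unsigned long long)addr,
	       addr == skb->expected_addr ? "" : "  <-- corrupted");
}

int main(void)
{
	struct fake_skb skb[5];
	int i;

	/* Steps 1-4: SKB1..SKB4 fill lcq[0..3]. */
	for (i = 0; i < 4; i++)
		reserve_addr(&skb[i], 0x1000 * (i + 1));

	/* Step 6: SKB4 completes first, freeing one cq slot. */
	destructor(&skb[3]);

	/* Step 7: SKB5 reserves, wrapping around onto lcq[0]. */
	reserve_addr(&skb[4], 0x5000);

	/* Step 8: SKB1's destructor finally runs and reads SKB5's address. */
	destructor(&skb[0]);

	return 0;
}

When compiled and run, the last call reports 0x5000 instead of SKB1's
original 0x1000, which is the corruption described above.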
> +static void xsk_cq_submit_addr_locked(struct sk_buff *skb)
> {
> - u32 num_descs = xsk_get_num_desc(skb);
> - struct xsk_addrs *xsk_addr;
> - u32 descs_processed = 0;
> + struct xsk_buff_pool *pool = xdp_sk(skb->sk)->pool;
> + u8 i, num = xsk_get_num_desc(skb);
> unsigned long flags;
> - u32 idx, i;
>
> spin_lock_irqsave(&pool->cq_prod_lock, flags);
> - idx = xskq_get_prod(pool->cq);
> -
> - if (unlikely(num_descs > 1)) {
> - xsk_addr = (struct xsk_addrs *)skb_shinfo(skb)->destructor_arg;
> -
> - for (i = 0; i < num_descs; i++) {
> - xskq_prod_write_addr(pool->cq, idx + descs_processed,
> - xsk_addr->addrs[i]);
> - descs_processed++;
> - }
> - kmem_cache_free(xsk_tx_generic_cache, xsk_addr);
> - } else {
> - xskq_prod_write_addr(pool->cq, idx,
> - xsk_skb_destructor_get_addr(skb));
> - descs_processed++;
> - }
> - xskq_prod_submit_n(pool->cq, descs_processed);
> + for (i = 0; i < num; i++)
> + xsk_cq_write_addr(skb, i);
> + xskq_prod_submit_n(pool->cq, num);
> spin_unlock_irqrestore(&pool->cq_prod_lock, flags);
> }
[ ... ]
---
AI reviewed your patch. Please fix the bug or reply by email explaining why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/20735004935