[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAL+tcoB5Y1YiQjbdzN4FdxGBf4=1neLMUvyZRPL55752GHyKPQ@mail.gmail.com>
Date: Tue, 9 Dec 2025 22:19:50 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: bot+bpf-ci@...nel.org
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, bjorn@...nel.org, magnus.karlsson@...el.com,
maciej.fijalkowski@...el.com, jonathan.lemon@...il.com, sdf@...ichev.me,
ast@...nel.org, daniel@...earbox.net, hawk@...nel.org,
john.fastabend@...il.com, bpf@...r.kernel.org, netdev@...r.kernel.org,
kernelxing@...cent.com, andrii@...nel.org, martin.lau@...nel.org,
eddyz87@...il.com, yonghong.song@...ux.dev, clm@...a.com,
ihor.solodrai@...ux.dev
Subject: Re: [PATCH bpf-next v1 1/2] xsk: introduce local_cq for each af_xdp socket
On Tue, Dec 9, 2025 at 5:29 PM <bot+bpf-ci@...nel.org> wrote:
>
> > diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
> > index 23e8861e8b25..c53ab2609d8c 100644
> > --- a/include/net/xdp_sock.h
> > +++ b/include/net/xdp_sock.h
> > @@ -45,6 +45,12 @@ struct xsk_map {
> > struct xdp_sock __rcu *xsk_map[];
> > };
> >
> > +struct local_cq {
> > + u32 prod ____cacheline_aligned_in_smp;
> > + u32 ring_mask ____cacheline_aligned_in_smp;
> > + u64 desc[] ____cacheline_aligned_in_smp;
> > +};
>
> The prod field is not initialized in xsk_init_local_cq(). Since vmalloc()
> does not zero memory, prod will contain garbage values when first used.
> The next commit in this series (722af5ef5295 "xsk: introduce a dedicated
> local completion queue for each xsk") uses prod with lcq->prod++ before
> any initialization occurs.
>
> Should prod be initialized to 0 in xsk_init_local_cq()?
Yes, it can be done here. Will fix it. Thanks.
>
> > +
> > struct xdp_sock {
> > /* struct sock must be the first member of struct xdp_sock */
> > struct sock sk;
>
> [ ... ]
>
> > diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> > index f093c3453f64..ce165d093105 100644
> > --- a/net/xdp/xsk.c
> > +++ b/net/xdp/xsk.c
>
> [ ... ]
>
> > +static int xsk_init_local_cq(struct xdp_sock *xs)
> > +{
> > + u32 nentries = xs->pool->cq->nentries;
> > + size_t size = struct_size_t(struct local_cq, desc, nentries);
> > +
> > + xs->lcq = vmalloc(size);
> > + if (!xs->lcq)
> > + return -ENOMEM;
> > + xs->lcq->ring_mask = nentries - 1;
> > +
> > + return 0;
> > +}
>
> Can xs->pool->cq be NULL here? In the XDP_SHARED_UMEM path with
> different queue_id or device (xsk_bind()->xp_create_and_assign_umem()->
> xsk_init_local_cq()), xp_create_and_assign_umem() sets pool->cq =
> xs->cq_tmp, but there's no validation that xs->cq_tmp is non-NULL
> before calling xsk_init_local_cq().
Yes, it can happen here. But it should not happen in theory. I did a
quick analysis at the following link:
https://lore.kernel.org/all/CAL+tcoDQ6MyuGRE8mJi_cafqyO0wfgOw5WTqnCvKGbQBhKOGpA@mail.gmail.com/.
I felt tired and sleepy right now and will dig deeper into this
tomorrow. Hopefully I'm right.
My take is this NULL pointer of cq case should be avoided.
Thanks,
Jason
>
> For comparison, the non-shared umem path validates cq_tmp via
> xsk_validate_queues() before reaching xsk_init_local_cq(). Should the
> XDP_SHARED_UMEM path with different qid/dev also validate that cq_tmp
> is set before calling xp_create_and_assign_umem()?
>
> > +
> > +static void xsk_clear_local_cq(struct xdp_sock *xs)
> > +{
> > + vfree(xs->lcq);
> > +}
>
> [ ... ]
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/20058064407
Powered by blists - more mailing lists