Message-ID: <CAL+tcoCx23rAbzFTh=78mdC_7a_D-XLvi8yvZkc9Sbj74MANcw@mail.gmail.com>
Date: Tue, 4 Nov 2025 07:26:14 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Maciej Fijalkowski <maciej.fijalkowski@...el.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, bjorn@...nel.org, magnus.karlsson@...el.com,
jonathan.lemon@...il.com, sdf@...ichev.me, ast@...nel.org,
daniel@...earbox.net, hawk@...nel.org, john.fastabend@...il.com,
horms@...nel.org, andrew+netdev@...n.ch, bpf@...r.kernel.org,
netdev@...r.kernel.org, Jason Xing <kernelxing@...cent.com>
Subject: Re: [PATCH net-next v2 2/2] xsk: use a smaller new lock for shared
pool case
On Mon, Nov 3, 2025 at 10:58 PM Maciej Fijalkowski
<maciej.fijalkowski@...el.com> wrote:
>
> On Thu, Oct 30, 2025 at 08:06:46AM +0800, Jason Xing wrote:
> > From: Jason Xing <kernelxing@...cent.com>
> >
> > - Split cq_lock into two smaller locks: cq_prod_lock and
> > cq_cached_prod_lock
> > - Avoid disabling/enabling interrupts in the hot xmit path
> >
> > In both xsk_cq_cancel_locked() and xsk_cq_reserve_locked(), the race
> > is only between multiple xsks sharing the same pool. These paths all
> > run in process context rather than interrupt context, so the smaller
> > cq_cached_prod_lock can be taken without disabling interrupts.
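
For reference, a minimal sketch of what those two process-context paths
look like under the new lock (simplified; xskq_prod_reserve() and
xskq_prod_cancel() stand in here for the actual queue helpers the patch
uses, and the function bodies are not the exact kernel source):

        /* Called from sendmsg(), i.e. process context only, so a plain
         * spin_lock() is enough: the only contenders are other xsks
         * sharing this pool, never an interrupt handler.
         */
        static int xsk_cq_reserve_locked(struct xsk_buff_pool *pool)
        {
                int ret;

                spin_lock(&pool->cq_cached_prod_lock);
                ret = xskq_prod_reserve(pool->cq); /* bumps @cached_prod only */
                spin_unlock(&pool->cq_cached_prod_lock);

                return ret;
        }

        /* Error path: roll back the speculative @cached_prod bump. */
        static void xsk_cq_cancel_locked(struct xsk_buff_pool *pool)
        {
                spin_lock(&pool->cq_cached_prod_lock);
                xskq_prod_cancel(pool->cq);
                spin_unlock(&pool->cq_cached_prod_lock);
        }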
> >
> > While cq_cached_prod_lock ensures exclusive modification of
> > @cached_prod, cq_prod_lock in xsk_cq_submit_addr_locked() only covers
> > @producer and the corresponding @desc entries. Neither of those needs
> > to stay consistent with @cached_prod under cq_cached_prod_lock, which
> > is why the previous big lock can be split into two smaller ones. Note
> > that the SPSC rule is about the global producer and consumer state
> > visible to both sides, not the local or cached values.
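
For contrast, a sketch of the completion side: the skb destructor can run
from NAPI/softirq context, so cq_prod_lock is still taken with interrupts
disabled, but it only guards @producer and the descriptor entries. This
is simplified as well; xskq_prod_submit_addr() stands in for the real
queue helper, and the actual function handles multiple descriptors per
skb:

        static void xsk_cq_submit_addr_locked(struct xsk_buff_pool *pool,
                                              u64 addr)
        {
                unsigned long flags;

                spin_lock_irqsave(&pool->cq_prod_lock, flags);
                /* Write the descriptor and publish it by moving @producer;
                 * @cached_prod is not touched on this path.
                 */
                xskq_prod_submit_addr(pool->cq, addr);
                spin_unlock_irqrestore(&pool->cq_prod_lock, flags);
        }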
> >
> > Frequently disabling and enabling interrupts is very time consuming
> > in some cases, especially at per-descriptor granularity. After this
> > optimization that overhead is avoided in the hot xmit path, even when
> > the pool is shared by multiple xsks.
> >
> > With this patch, the performance number[1] goes from 1,872,565 pps
> > to 1,961,009 pps, a rise of around 5%.
> >
> > [1]: taskset -c 1 ./xdpsock -i enp2s0f1 -q 0 -t -S -s 64
> >
> > Signed-off-by: Jason Xing <kernelxing@...cent.com>
> > ---
> > include/net/xsk_buff_pool.h | 13 +++++++++----
> > net/xdp/xsk.c | 15 ++++++---------
> > net/xdp/xsk_buff_pool.c | 3 ++-
> > 3 files changed, 17 insertions(+), 14 deletions(-)
> >
> > diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
> > index cac56e6b0869..92a2358c6ce3 100644
> > --- a/include/net/xsk_buff_pool.h
> > +++ b/include/net/xsk_buff_pool.h
> > @@ -85,11 +85,16 @@ struct xsk_buff_pool {
> > bool unaligned;
> > bool tx_sw_csum;
> > void *addrs;
> > - /* Mutual exclusion of the completion ring in the SKB mode. Two cases to protect:
> > - * NAPI TX thread and sendmsg error paths in the SKB destructor callback and when
> > - * sockets share a single cq when the same netdev and queue id is shared.
> > + /* Mutual exclusion of the completion ring in the SKB mode.
> > + * Protect: NAPI TX thread and sendmsg error paths in the SKB
> > + * destructor callback.
> > */
> > - spinlock_t cq_lock;
> > + spinlock_t cq_prod_lock;
> > + /* Mutual exclusion of the completion ring in the SKB mode.
> > + * Protect: when sockets share a single cq when the same netdev
> > + * and queue id is shared.
> > + */
> > + spinlock_t cq_cached_prod_lock;
>
> Nice that existing hole is utilized here.
>
> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@...el.com>
Thanks for the review :)
Thanks,
Jason