Message-ID: <CAL+tcoCx23rAbzFTh=78mdC_7a_D-XLvi8yvZkc9Sbj74MANcw@mail.gmail.com>
Date: Tue, 4 Nov 2025 07:26:14 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Maciej Fijalkowski <maciej.fijalkowski@...el.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, bjorn@...nel.org, magnus.karlsson@...el.com,
jonathan.lemon@...il.com, sdf@...ichev.me, ast@...nel.org,
daniel@...earbox.net, hawk@...nel.org, john.fastabend@...il.com,
horms@...nel.org, andrew+netdev@...n.ch, bpf@...r.kernel.org,
netdev@...r.kernel.org, Jason Xing <kernelxing@...cent.com>
Subject: Re: [PATCH net-next v2 2/2] xsk: use a smaller new lock for shared
pool case
On Mon, Nov 3, 2025 at 10:58 PM Maciej Fijalkowski
<maciej.fijalkowski@...el.com> wrote:
>
> On Thu, Oct 30, 2025 at 08:06:46AM +0800, Jason Xing wrote:
> > From: Jason Xing <kernelxing@...cent.com>
> >
> > - Split cq_lock into two smaller locks: cq_prod_lock and
> > cq_cached_prod_lock
> > - Avoid disabling/enabling interrupts in the hot xmit path
> >
> > In both xsk_cq_cancel_locked() and xsk_cq_reserve_locked(), the race
> > is only between multiple xsks sharing the same pool. These paths all
> > run in process context rather than interrupt context, so the smaller
> > cq_cached_prod_lock can be taken without disabling interrupts.
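
For reference, a minimal sketch of what those two process-context paths
look like under the new lock (simplified; xskq_prod_reserve() and
xskq_prod_cancel() stand in here for the actual queue helpers the patch
uses, and the function bodies are not the exact kernel source):

        /* Called from sendmsg(), i.e. process context only, so a plain
         * spin_lock() is enough: the only contenders are other xsks
         * sharing this pool, never an interrupt handler.
         */
        static int xsk_cq_reserve_locked(struct xsk_buff_pool *pool)
        {
                int ret;

                spin_lock(&pool->cq_cached_prod_lock);
                ret = xskq_prod_reserve(pool->cq); /* bumps @cached_prod only */
                spin_unlock(&pool->cq_cached_prod_lock);

                return ret;
        }

        /* Error path: roll back the speculative @cached_prod bump. */
        static void xsk_cq_cancel_locked(struct xsk_buff_pool *pool)
        {
                spin_lock(&pool->cq_cached_prod_lock);
                xskq_prod_cancel(pool->cq);
                spin_unlock(&pool->cq_cached_prod_lock);
        }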
> >
> > While cq_cached_prod_lock ensures exclusive modification of
> > @cached_prod, cq_prod_lock in xsk_cq_submit_addr_locked() only covers
> > @producer and the corresponding @desc entries. Neither of those needs
> > to stay consistent with @cached_prod under cq_cached_prod_lock, which
> > is why the previous big lock can be split into two smaller ones. Note
> > that the SPSC rule is about the global producer and consumer state
> > visible to both sides, not the local or cached values.
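
For contrast, a sketch of the completion side: the skb destructor can run
from NAPI/softirq context, so cq_prod_lock is still taken with interrupts
disabled, but it only guards @producer and the descriptor entries. This
is simplified as well; xskq_prod_submit_addr() stands in for the real
queue helper, and the actual function handles multiple descriptors per
skb:

        static void xsk_cq_submit_addr_locked(struct xsk_buff_pool *pool,
                                              u64 addr)
        {
                unsigned long flags;

                spin_lock_irqsave(&pool->cq_prod_lock, flags);
                /* Write the descriptor and publish it by moving @producer;
                 * @cached_prod is not touched on this path.
                 */
                xskq_prod_submit_addr(pool->cq, addr);
                spin_unlock_irqrestore(&pool->cq_prod_lock, flags);
        }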
> >
> > Frequently disabling and enabling interrupts is very time consuming
> > in some cases, especially at per-descriptor granularity. After this
> > optimization that overhead is avoided in the hot xmit path, even when
> > the pool is shared by multiple xsks.
> >
> > With this patch, the performance number[1] goes from 1,872,565 pps
> > to 1,961,009 pps, a rise of around 5%.
> >
> > [1]: taskset -c 1 ./xdpsock -i enp2s0f1 -q 0 -t -S -s 64
> >
> > Signed-off-by: Jason Xing <kernelxing@...cent.com>
> > ---
> > include/net/xsk_buff_pool.h | 13 +++++++++----
> > net/xdp/xsk.c | 15 ++++++---------
> > net/xdp/xsk_buff_pool.c | 3 ++-
> > 3 files changed, 17 insertions(+), 14 deletions(-)
> >
> > diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
> > index cac56e6b0869..92a2358c6ce3 100644
> > --- a/include/net/xsk_buff_pool.h
> > +++ b/include/net/xsk_buff_pool.h
> > @@ -85,11 +85,16 @@ struct xsk_buff_pool {
> > bool unaligned;
> > bool tx_sw_csum;
> > void *addrs;
> > - /* Mutual exclusion of the completion ring in the SKB mode. Two cases to protect:
> > - * NAPI TX thread and sendmsg error paths in the SKB destructor callback and when
> > - * sockets share a single cq when the same netdev and queue id is shared.
> > + /* Mutual exclusion of the completion ring in the SKB mode.
> > + * Protect: NAPI TX thread and sendmsg error paths in the SKB
> > + * destructor callback.
> > */
> > - spinlock_t cq_lock;
> > + spinlock_t cq_prod_lock;
> > + /* Mutual exclusion of the completion ring in the SKB mode.
> > + * Protect: when sockets share a single cq when the same netdev
> > + * and queue id is shared.
> > + */
> > + spinlock_t cq_cached_prod_lock;
>
> Nice that existing hole is utilized here.
>
> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@...el.com>
Thanks for the review :)
Thanks,
Jason