Message-ID: <CAL+tcoD3-qtq4Kcmo9eb4mw6bdSYCCjxzNB3qov5LDYoe_gtkw@mail.gmail.com>
Date: Mon, 17 Nov 2025 09:07:34 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
Kuniyuki Iwashima <kuniyu@...gle.com>, netdev@...r.kernel.org, eric.dumazet@...il.com
Subject: Re: [PATCH v3 net-next 3/3] net: use napi_skb_cache even in process context
On Mon, Nov 17, 2025 at 4:27 AM Eric Dumazet <edumazet@...gle.com> wrote:
>
> This is a followup of commit e20dfbad8aab ("net: fix napi_consume_skb()
> with alien skbs").
>
> Now the per-cpu napi_skb_cache is populated from TX completion path,
> we can make use of this cache, especially for cpus not used
> from a driver NAPI poll (primary user of napi_cache).
>
> We can use the napi_skb_cache only if current context is not from hard irq.
>
> With this patch, I consistently reach 130 Mpps on my UDP tx stress test
> and reduce SLUB spinlock contention to smaller values.
>
> Note there is still some SLUB contention for skb->head allocations.
>
> I had to tune /sys/kernel/slab/skbuff_small_head/cpu_partial
> and /sys/kernel/slab/skbuff_small_head/min_partial depending
> on the platform taxonomy.
>
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
Reviewed-by: Jason Xing <kerneljasonxing@...il.com>
Thanks for working on this. I had been thinking about this as well,
since it affects the hot path for xsk (please see
__xsk_generic_xmit()->xsk_build_skb()->sock_alloc_send_pskb()). But I
wasn't sure how the cost of disabling irqs compares with the cost of
allocating memory. I once removed an irq disable/enable pair and saw a
minor improvement, as this commit[1] says. Would you share your
experience with us in this case?

In the meantime, I will do more rounds of experiments to see how the
series performs.
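
To make the idea in the quoted patch description concrete, here is a minimal userspace sketch of "use the cache only when not in hard irq, otherwise fall back to the general allocator". This is not the kernel code: struct buf_cache, buf_alloc(), buf_free(), CACHE_SIZE and the sim_in_hardirq flag are all hypothetical stand-ins, and malloc()/free() stand in for the slab allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_SIZE 4 /* hypothetical size, not the kernel's value */

struct buf {
	char data[64];
};

/* Hypothetical stand-in for one CPU's buffer cache. */
struct buf_cache {
	struct buf *slots[CACHE_SIZE];
	size_t count;
};

/* Stand-in for a hard irq context check; the caller sets it in this sketch. */
static bool sim_in_hardirq;

/* Take a buffer from the cache when allowed, else use the general allocator. */
static struct buf *buf_alloc(struct buf_cache *c)
{
	assert(c != NULL);
	if (!sim_in_hardirq && c->count > 0)
		return c->slots[--c->count];
	return malloc(sizeof(struct buf));
}

/* Return a buffer to the cache when allowed and not full, else free it. */
static void buf_free(struct buf_cache *c, struct buf *b)
{
	assert(c != NULL);
	if (!sim_in_hardirq && c->count < CACHE_SIZE) {
		c->slots[c->count++] = b;
		return;
	}
	free(b);
}
```

In this sketch a free from "process context" refills the cache, and a later allocation in the same context is served from it without touching the allocator, while anything done with sim_in_hardirq set bypasses the cache entirely.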

[1]
commit 30ed05adca4a05c50594384cff18910858dd1d35
Author: Jason Xing <kernelxing@...cent.com>
Date:   Thu Oct 30 08:06:46 2025 +0800

    xsk: use a smaller new lock for shared pool case

    - Split cq_lock into two smaller locks: cq_prod_lock and
      cq_cached_prod_lock
    - Avoid disabling/enabling interrupts in the hot xmit path

Thanks,
Jason