Message-ID: <CANn89i+o6QAUXJkmVJv1HTCGxK05uGjtOT5SUF4ujZ4XCLQRXw@mail.gmail.com>
Date: Mon, 17 Nov 2025 02:19:21 -0800
From: Eric Dumazet <edumazet@...gle.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Simon Horman <horms@...nel.org>, Kuniyuki Iwashima <kuniyu@...gle.com>,
Jason Xing <kerneljasonxing@...il.com>, netdev@...r.kernel.org, eric.dumazet@...il.com
Subject: Re: [PATCH v2 net-next 3/3] net: use napi_skb_cache even in process context
On Mon, Nov 17, 2025 at 2:12 AM Paolo Abeni <pabeni@...hat.com> wrote:
>
> On 11/14/25 1:12 PM, Eric Dumazet wrote:
> > This is a followup of commit e20dfbad8aab ("net: fix napi_consume_skb()
> > with alien skbs").
> >
> > Now the per-cpu napi_skb_cache is populated from TX completion path,
> > we can make use of this cache, especially for cpus not used
> > from a driver NAPI poll (primary user of napi_cache).
> >
> > We can use the napi_skb_cache only if current context is not from hard irq.
> >
> > With this patch, I consistently reach 130 Mpps on my UDP tx stress test
> > and reduce SLUB spinlock contention to smaller values.
> >
> > Note there is still some SLUB contention for skb->head allocations.
> >
> > I had to tune /sys/kernel/slab/skbuff_small_head/cpu_partial
> > and /sys/kernel/slab/skbuff_small_head/min_partial depending
> > on the platform taxonomy.
>
> Double checking I read the above correctly: you did the tune to reduce
> the SLUB contention on skb->head and reach the 130Mpps target, am I correct?
>
> If so, could you please share the used values for future memory?
>
Note that skbuff_small_head is mostly used by TCP tx packets, incoming
GRO packets (where all the payload is in page frags),
and small UDP packets (my benchmark).

On an AMD Turin host with an IDPF NIC (which unfortunately limits each
napi poll to 256 TX completions), I had to change them to:

echo 80 >/sys/kernel/slab/skbuff_small_head/cpu_partial
echo 45 >/sys/kernel/slab/skbuff_small_head/min_partial

An increase to 100 and 80 also showed benefits.
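For anyone wanting to script the tuning above, a guarded sketch (the
sysfs paths only exist on SLUB kernels that have the skbuff_small_head
cache, and writing them needs root):

```shell
# Apply the skbuff_small_head SLUB tuning from this thread, but only
# if the cache is actually present on this kernel.
SLAB=/sys/kernel/slab/skbuff_small_head
if [ -d "$SLAB" ]; then
    echo 80 > "$SLAB/cpu_partial"
    echo 45 > "$SLAB/min_partial"
    # Show the values now in effect.
    grep . "$SLAB/cpu_partial" "$SLAB/min_partial"
else
    echo "no $SLAB on this kernel; nothing to tune"
fi
```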
It is quite possible that the recent SLUB sheaves work could help; I have
been unable to test this yet because upstream IDPF just does not work on
my lab hosts (probably something caused by our own firmware code).

Does anyone have a very fast NIC to test whether we can leverage SLUB
sheaves on some critical skb caches?