Message-ID: <CANn89iKP7X_GsuP9VcUJURC3T4Z2wJTyjf2GS0PCxZnb6APnCQ@mail.gmail.com>
Date: Fri, 7 Nov 2025 08:26:38 -0800
From: Eric Dumazet <edumazet@...gle.com>
To: Jason Xing <kerneljasonxing@...il.com>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
Kuniyuki Iwashima <kuniyu@...gle.com>, Willem de Bruijn <willemb@...gle.com>, netdev@...r.kernel.org,
eric.dumazet@...il.com
Subject: Re: [PATCH net-next 3/3] net: increase skb_defer_max default to 128
On Fri, Nov 7, 2025 at 8:21 AM Jason Xing <kerneljasonxing@...il.com> wrote:
>
> On Sat, Nov 8, 2025 at 12:08 AM Eric Dumazet <edumazet@...gle.com> wrote:
> >
> > On Fri, Nov 7, 2025 at 8:04 AM Jason Xing <kerneljasonxing@...il.com> wrote:
> > >
> > > On Sat, Nov 8, 2025 at 12:00 AM Eric Dumazet <edumazet@...gle.com> wrote:
> > > >
> > > > On Fri, Nov 7, 2025 at 7:50 AM Jason Xing <kerneljasonxing@...il.com> wrote:
> > > > >
> > > > > On Fri, Nov 7, 2025 at 11:47 PM Eric Dumazet <edumazet@...gle.com> wrote:
> > > > > >
> > > > > > On Fri, Nov 7, 2025 at 7:37 AM Jason Xing <kerneljasonxing@...il.com> wrote:
> > > > > > >
> > > > > > > On Fri, Nov 7, 2025 at 4:30 AM Eric Dumazet <edumazet@...gle.com> wrote:
> > > > > > > >
> > > > > > > > The skb_defer_max default value is very conservative, and can be increased
> > > > > > > > to avoid too many calls to kick_defer_list_purge().
> > > > > > > >
> > > > > > > > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> > > > > > >
> > > > > > > I was wondering whether we ought to enlarge NAPI_SKB_CACHE_SIZE to 128 as
> > > > > > > well, since skb freeing happens in softirq context. The idea came to me
> > > > > > > while I was working on the af_xdp optimization, which also defers freeing
> > > > > > > skbs to gain some performance. I'd like to know your opinion on this,
> > > > > > > thanks in advance!
> > > > > >
> > > > > > Makes sense. I even had a patch like this in my queue ;)
> > > > >
> > > > > Great to hear that. Look forward to seeing it soon :)
> > > >
> > > > Oh please go ahead !
> > >
> > > Okay, thanks for letting me post this minor change. I just thought you
> > > wanted to do this on your own :P
> > >
> > > Will do it soon :)
> >
> > Note that I was thinking of freeing only 32 skbs when the array fills up
> > completely.
> >
> > The current code frees half of it; keeping 96 skbs and freeing 32 of them
> > seems better.
> >
> > Same for the bulk alloc: we could probably go to 32 (instead of 16).
>
> Thanks for your suggestion!
>
> However, sorry, I didn't fully get it. What is the difference between
> freeing only 32 and freeing half of the new value? My thought was that
> freeing half, i.e. 128/2, would reduce how often the skb free functions
> need to run. Could you shed some light on those numbers?
If we free half, a subsequent net_rx_action() calling napi_poll() 2 or 3
times will exhaust the remaining 64.
This is a per-cpu reserve of sk_buff (256 bytes each). I think we can
afford keeping them in the pool just in case...
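
For illustration, here is a quick user-space sketch of that policy (the names
and constants below are made up for this sketch, they are not the kernel code):
when the 128-slot per-cpu cache is full, only 32 objects are freed, so 96 stay
available for the next napi_poll() rounds.

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sizes, only for this sketch. */
#define CACHE_SIZE  128   /* proposed cache size */
#define FLUSH_COUNT 32    /* free only 32 when the array is full */

struct fake_cache {
	void *slots[CACHE_SIZE];
	unsigned int count;
};

/* Put one object back; on overflow free only FLUSH_COUNT entries so that
 * CACHE_SIZE - FLUSH_COUNT (96 here) remain cached.
 */
static void cache_put(struct fake_cache *c, void *obj)
{
	if (c->count == CACHE_SIZE) {
		for (unsigned int i = 0; i < FLUSH_COUNT; i++)
			free(c->slots[--c->count]);
	}
	c->slots[c->count++] = obj;
}

int main(void)
{
	struct fake_cache c = { .count = 0 };

	for (int i = 0; i < 200; i++)
		cache_put(&c, malloc(256));	/* 256 bytes, like an sk_buff */

	printf("objects kept in cache: %u\n", c.count);

	while (c.count)				/* drain before exit */
		free(c.slots[--c.count]);
	return 0;
}
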
I also had a prefetch() in napi_skb_cache_get():
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7ac5f8aa1235a55db02b40b5a0f51bb3fa53fa03..3d40c4b0c580afc183c30e2efb0f953d0d5aabf9 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -297,6 +297,8 @@ static struct sk_buff *napi_skb_cache_get(void)
 	}
 
 	skb = nc->skb_cache[--nc->skb_count];
+	if (nc->skb_count)
+		prefetch(nc->skb_cache[nc->skb_count - 1]);
 	local_unlock_nested_bh(&napi_alloc_cache.bh_lock);
 	kasan_mempool_unpoison_object(skb, skbuff_cache_size);
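
In user space, the same access pattern looks roughly like the sketch below
(made-up names again, and GCC/Clang's __builtin_prefetch in place of the
kernel's prefetch()): while the caller consumes the object it just popped,
the CPU can already start pulling in the next cached entry.

#include <stdio.h>
#include <stdlib.h>

#define CACHE_SIZE 128

static void *slots[CACHE_SIZE];
static unsigned int count;

/* Pop one object; hint the CPU to start loading the next one so its first
 * cache line is (hopefully) warm by the time the caller pops again.
 */
static void *cache_get(void)
{
	void *obj;

	if (!count)
		return NULL;

	obj = slots[--count];
	if (count)
		__builtin_prefetch(slots[count - 1]);
	return obj;
}

int main(void)
{
	void *obj;

	for (count = 0; count < CACHE_SIZE; count++)
		slots[count] = malloc(256);

	while ((obj = cache_get()))
		free(obj);
	return 0;
}
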