Message-ID: <d97b6ec6-59fb-4123-9d96-27b9b32dc5cc@redhat.com>
Date: Thu, 9 Jan 2025 14:16:22 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: Alexander Lobakin <aleksander.lobakin@...el.com>,
Andrew Lunn <andrew+netdev@...n.ch>, "David S. Miller"
<davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>
Cc: Lorenzo Bianconi <lorenzo@...nel.org>, Daniel Xu <dxu@...uu.xyz>,
Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
John Fastabend <john.fastabend@...il.com>,
Toke Høiland-Jørgensen <toke@...nel.org>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Martin KaFai Lau <martin.lau@...ux.dev>, netdev@...r.kernel.org,
bpf@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next v2 5/8] net: skbuff: introduce
napi_skb_cache_get_bulk()

On 1/7/25 4:29 PM, Alexander Lobakin wrote:
> Add a function to get an array of skbs from the NAPI percpu cache.
> It's supposed to be a drop-in replacement for
> kmem_cache_alloc_bulk(skbuff_head_cache, GFP_ATOMIC) and
> xdp_alloc_skb_bulk(GFP_ATOMIC). The difference (apart from the
> requirement to call it only from the BH) is that it tries to use
> as many NAPI cache entries for skbs as possible, and allocate new
> ones only if needed.
>
> The logic is as follows:
>
> * there are enough skbs in the cache: decache them and return to the
> caller;
> * not enough: try refilling the cache first. If there are now enough
> skbs, return;
> * still not enough: try allocating skbs directly to the output array
> with %GFP_ZERO, maybe we'll be able to get some. If there's now
> enough, return;
> * still not enough: return as many as we were able to obtain.
>
> Most of the time, when called from the NAPI polling loop, the first case
> will be true, sometimes (rarely) the second one. The third and the fourth
> happen only under heavy memory pressure.
> It can save a significant amount of CPU cycles if there are GRO cycles
> and/or Tx completion cycles (anything that descends to
> napi_skb_cache_put()) happening on this CPU.
>
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@...el.com>
> Tested-by: Daniel Xu <dxu@...uu.xyz>
> ---
> include/linux/skbuff.h | 1 +
> net/core/skbuff.c | 62 ++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 63 insertions(+)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index bb2b751d274a..1c089c7c14e1 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -1315,6 +1315,7 @@ struct sk_buff *build_skb_around(struct sk_buff *skb,
> void *data, unsigned int frag_size);
> void skb_attempt_defer_free(struct sk_buff *skb);
>
> +u32 napi_skb_cache_get_bulk(void **skbs, u32 n);
> struct sk_buff *napi_build_skb(void *data, unsigned int frag_size);
> struct sk_buff *slab_build_skb(void *data);
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index a441613a1e6c..42eb31dcc9ce 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -367,6 +367,68 @@ static struct sk_buff *napi_skb_cache_get(void)
> return skb;
> }
>
> +/**
> + * napi_skb_cache_get_bulk - obtain a number of zeroed skb heads from the cache
> + * @skbs: pointer to an at least @n-sized array to fill with skb pointers
> + * @n: number of entries to provide
> + *
> + * Tries to obtain @n &sk_buff entries from the NAPI percpu cache and writes
> + * the pointers into the provided array @skbs. If there are fewer entries
> + * available, tries to replenish the cache and bulk-allocates the diff from
> + * the MM layer if needed.
> + * The heads are being zeroed with either memset() or %__GFP_ZERO, so they are
> + * ready for {,__}build_skb_around() and don't have any data buffers attached.
> + * Must be called *only* from the BH context.
> + *
> + * Return: number of successfully allocated skbs (@n if no actual allocation
> + * needed or kmem_cache_alloc_bulk() didn't fail).
> + */
> +u32 napi_skb_cache_get_bulk(void **skbs, u32 n)
> +{
> + struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache);
> + u32 bulk, total = n;
> +
> + local_lock_nested_bh(&napi_alloc_cache.bh_lock);
> +
> + if (nc->skb_count >= n)
> + goto get;
I (mis?)understood from the commit message that this condition should be
marked likely(), too?!?
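i.e. I was expecting something like this here (untested, just to show the
annotation I had in mind):

	if (likely(nc->skb_count >= n))
		goto get;
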
> + /* Not enough cached skbs. Try refilling the cache first */
> + bulk = min(NAPI_SKB_CACHE_SIZE - nc->skb_count, NAPI_SKB_CACHE_BULK);
> + nc->skb_count += kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
> + GFP_ATOMIC | __GFP_NOWARN, bulk,
> + &nc->skb_cache[nc->skb_count]);
> + if (likely(nc->skb_count >= n))
> + goto get;
> +
> + /* Still not enough. Bulk-allocate the missing part directly, zeroed */
> + n -= kmem_cache_alloc_bulk(net_hotdata.skbuff_cache,
> + GFP_ATOMIC | __GFP_ZERO | __GFP_NOWARN,
> + n - nc->skb_count, &skbs[nc->skb_count]);
You should probably cap 'n' to NAPI_SKB_CACHE_SIZE. Also, what about
latency spikes when n == 48 (which should be the maximum possible with
such a limit) here?
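Something along these lines at function entry is what I have in mind
(completely untested, just a sketch; 'total' would then need to be
initialized after the clamp):

	/* sketch: never request more than the percpu cache can hold in
	 * one go; callers already have to handle a short return value
	 */
	if (unlikely(n > NAPI_SKB_CACHE_SIZE))
		n = NAPI_SKB_CACHE_SIZE;
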
/P