Message-ID: <319497a698ba77244aa935c13dc9b93c893dbbc3.camel@redhat.com>
Date: Fri, 22 Apr 2022 11:02:27 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Eric Dumazet <eric.dumazet@...il.com>,
"David S . Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>
Cc: netdev <netdev@...r.kernel.org>, Eric Dumazet <edumazet@...gle.com>
Subject: Re: [PATCH net-next] net: generalize skb freeing deferral to
per-cpu lists
Hi,
Looks great! I have a few questions below, mostly to better understand
how it works...
On Thu, 2022-04-21 at 08:39 -0700, Eric Dumazet wrote:
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 84d78df60453955a8eaf05847f6e2145176a727a..2fe311447fae5e860eee95f6e8772926d4915e9f 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -1080,6 +1080,7 @@ struct sk_buff {
> unsigned int sender_cpu;
> };
> #endif
> + u16 alloc_cpu;
I *think* we could in theory fetch the CPU that allocated the skb from
the napi_id, by adding a cpu field to napi_struct and implementing a
helper to fetch it. Have you considered that option? Or would the napi
lookup be just too expensive?
[...]
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 4a77ebda4fb155581a5f761a864446a046987f51..4136d9c0ada6870ea0f7689702bdb5f0bbf29145 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4545,6 +4545,12 @@ static void rps_trigger_softirq(void *data)
>
> #endif /* CONFIG_RPS */
>
> +/* Called from hardirq (IPI) context */
> +static void trigger_rx_softirq(void *data)
Perhaps '__always_unused'? (But the compiler doesn't complain here.)
> @@ -6486,3 +6487,46 @@ void __skb_ext_put(struct skb_ext *ext)
> }
> EXPORT_SYMBOL(__skb_ext_put);
> #endif /* CONFIG_SKB_EXTENSIONS */
> +
> +/**
> + * skb_attempt_defer_free - queue skb for remote freeing
> + * @skb: buffer
> + *
> + * Put @skb in a per-cpu list, using the cpu which
> + * allocated the skb/pages to reduce false sharing
> + * and memory zone spinlock contention.
> + */
> +void skb_attempt_defer_free(struct sk_buff *skb)
> +{
> + int cpu = skb->alloc_cpu;
> + struct softnet_data *sd;
> + unsigned long flags;
> + bool kick;
> +
> + if (WARN_ON_ONCE(cpu >= nr_cpu_ids) || !cpu_online(cpu)) {
> + __kfree_skb(skb);
> + return;
> + }
I'm wondering if we should also skip the deferral when
cpu == smp_processor_id()?
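To make the question concrete, here is a minimal userspace sketch of the
decision I have in mind. It is not kernel code: defer_decision(), the
enum and the explicit this_cpu/alloc_cpu_online parameters are
hypothetical stand-ins for smp_processor_id() and cpu_online(), and only
the first check mirrors what the patch already does.

```c
#include <stdbool.h>

/* Hypothetical userspace model, not kernel code. */
enum free_action {
	FREE_NOW,		/* models calling __kfree_skb() directly */
	DEFER_TO_ALLOC_CPU,	/* models queueing on the remote per-cpu list */
};

static enum free_action defer_decision(unsigned int alloc_cpu,
				       unsigned int this_cpu,
				       unsigned int nr_cpu_ids,
				       bool alloc_cpu_online)
{
	/* Same condition as in the patch: bogus or offline CPU. */
	if (alloc_cpu >= nr_cpu_ids || !alloc_cpu_online)
		return FREE_NOW;

	/* The extra check asked about above (an assumption, not in the patch). */
	if (alloc_cpu == this_cpu)
		return FREE_NOW;

	return DEFER_TO_ALLOC_CPU;
}
```

With the extra check, an skb freed on the CPU that allocated it would
never take the list lock at all.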
> +
> + sd = &per_cpu(softnet_data, cpu);
> + /* We do not send an IPI or any signal.
> + * Remote cpu will eventually call skb_defer_free_flush()
> + */
> + spin_lock_irqsave(&sd->skb_defer_list.lock, flags);
> + __skb_queue_tail(&sd->skb_defer_list, skb);
> +
> + /* kick every time queue length reaches 128.
> + * This should avoid blocking in smp_call_function_single_async().
> + * This condition should hardly be bit under normal conditions,
> + * unless cpu suddenly stopped to receive NIC interrupts.
> + */
> + kick = skb_queue_len(&sd->skb_defer_list) == 128;
Out of sheer curiosity, why 128? I guess it should be larger than
NAPI_POLL_WEIGHT, to cope with the maximum theoretical burst length?
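To double check my reading of the '== 128' test, here is a small
userspace model of the enqueue side. It is only a sketch: the struct and
function names are made up, there is no locking, and it assumes the
remote CPU never drains the list while we enqueue.

```c
#include <stdbool.h>

#define DEFER_KICK_LEN 128u	/* threshold taken from the patch */

/* Hypothetical stand-in for the length of sd->skb_defer_list. */
struct defer_list_model {
	unsigned int qlen;
};

/* Models one __skb_queue_tail() followed by the kick test. */
static bool defer_enqueue(struct defer_list_model *list)
{
	list->qlen++;
	return list->qlen == DEFER_KICK_LEN;
}

/* Counts how many kicks n back-to-back enqueues would request. */
static unsigned int count_kicks(unsigned int n)
{
	struct defer_list_model list = { .qlen = 0 };
	unsigned int kicks = 0;
	unsigned int i;

	for (i = 0; i < n; i++)
		if (defer_enqueue(&list))
			kicks++;

	return kicks;
}
```

If this model is right, the equality test requests a single kick when
the list first reaches 128 entries, and none for the entries queued
after that until the list is drained.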
Thanks!
Paolo