[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <16E07575-7B31-4C00-B8B6-28B09B0D4A94@akamai.com>
Date: Tue, 4 Nov 2025 12:44:05 +0000
From: "Hudson, Nick" <nhudson@...mai.com>
To: Eric Dumazet <edumazet@...gle.com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Other use cases for skb_attempt_defer_free [was Re:
skb_attempt_defer_free and reference counting]
Hi Eric,
Any thoughts / insights on the patch below?
It really improves performance of a vhost-net / tap setup for Rx packets.
Thanks,
Nick
> On 31 Oct 2025, at 14:02, Hudson, Nick <nhudson@...mai.com> wrote:
>
>
>
>> On 31 Oct 2025, at 11:43, Eric Dumazet <edumazet@...gle.com> wrote:
>>
>> !-------------------------------------------------------------------|
>> This Message Is From an External Sender
>> This message came from outside your organization.
>> |-------------------------------------------------------------------!
>>
>> On Fri, Oct 31, 2025 at 4:04 AM Hudson, Nick <nhudson@...mai.com> wrote:
>>>
>>> Hi,
>>>
>>> I’ve been looking at using skb_attempt_defer_free and had a question about the skb reference counting.
>>>
>>> The existing reference release for any skb handed to skb_attempt_defer_free is done in skb_defer_free_flush (via napi_consume_skb). However, it seems to me that calling skb_attempt_defer_free on the same skb to drop the multiple references is problematic as, if the defer_list isn’t serviced between the calls, the list gets corrupted. That is, the skb can’t appear on the list twice.
>>>
>>> Would it be possible to move the reference count drop into skb_attempt_defer_free and only add the skb to the list on last reference drop?
>>
>> We do not plan using this helper for arbitrary skbs, but ones fully
>> owned by TCP and UDP receive paths.
>
> Interesting.
>
> This patch has shown to give a performance benefit and I’m curious if it problematic in any way.
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index fae1a0ab36bd..59ffac9afdad 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -2251,7 +2251,7 @@ static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
> if (unlikely(ret < 0))
> kfree_skb(skb);
> else
> - consume_skb(skb);
> + skb_attempt_defer_free(skb);
> }
>
> return ret;
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index f220306731da..525b2a2698c6 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -7167,6 +7167,7 @@ nodefer: kfree_skb_napi_cache(skb);
> if (unlikely(kick))
> kick_defer_list_purge(sd, cpu);
> }
> +EXPORT_SYMBOL(skb_attempt_defer_free);
>
> static void skb_splice_csum_page(struct sk_buff *skb, struct page *page,
>
>
>
>>
>> skb_share_check() must have been called before reaching them.
>>
>> In any case using skb->next could be problematic with shared skb.
>
> OK, so the assumption is skb->users is already 1. Perhaps there is an optimisation in skb_defer_free_flush if that is the case?
Content of type "text/html" skipped
Download attachment "smime.p7s" of type "application/pkcs7-signature" (3067 bytes)
Powered by blists - more mailing lists