netdev - Re: [PATCH net-next] skbuff: Optimize SKB coalescing for page pool case

Message-ID: <5b81338a-f784-d73e-170c-d12af38692cb@huawei.com>
Date: Thu, 29 Jun 2023 14:53:18 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Liang Chen <liangchen.linux@...il.com>, <ilias.apalodimas@...aro.org>,
	<hawk@...nel.org>
CC: <kuba@...nel.org>, <davem@...emloft.net>, <edumazet@...gle.com>,
	<pabeni@...hat.com>, <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next] skbuff: Optimize SKB coalescing for page pool
 case

On 2023/6/28 20:11, Liang Chen wrote:
> In order to address the issues encountered with commit 1effe8ca4e34
> ("skbuff: fix coalescing for page_pool fragment recycling"), the
> combination of the following condition was excluded from skb coalescing:
> 
> from->pp_recycle = 1
> from->cloned = 1
> to->pp_recycle = 1
> 
> However, with page pool environments, the aforementioned combination can
> be quite common. In scenarios with a higher number of small packets, it
> can significantly affect the success rate of coalescing. For example,
> when considering packets of 256 bytes size, our comparison of coalescing
> success rate is as follows:

skb_try_coalesce() only allows coalescing when the 'to' skb is not cloned.

Could you give more details about the testing where we have a non-cloned
'to' skb and a cloned 'from' skb? Both of them should belong to the same
flow.
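
To make the case in question concrete, here is a minimal user-space sketch
of only the cloned/pp_recycle part of the check, before and after this
patch. It is not kernel code: struct skb_model and both helper names are
hypothetical stand-ins, and the real skb_try_coalesce() performs further
checks that are left out here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two sk_buff properties the check reads. */
struct skb_model {
	bool pp_recycle;	/* fragments come from a page pool */
	bool cloned;		/* data is shared with a clone */
};

/* Model of the check before the patch. */
static bool can_coalesce_before(const struct skb_model *to,
				const struct skb_model *from)
{
	/* A cloned 'to' skb is never coalesced into. */
	if (to->cloned)
		return false;

	/* Mixed pp_recycle, or a cloned page pool 'from', is rejected. */
	if (to->pp_recycle != from->pp_recycle ||
	    (from->pp_recycle && from->cloned))
		return false;

	return true;
}

/* Model of the check with the patch applied. */
static bool can_coalesce_after(const struct skb_model *to,
			       const struct skb_model *from)
{
	if (to->cloned)
		return false;

	/* Only mixed pp_recycle is rejected now. */
	if (to->pp_recycle != from->pp_recycle)
		return false;

	return true;
}
```

In this model the only combination whose result changes is a non-cloned
'to' with pp_recycle set and a cloned 'from' with pp_recycle set.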

I had the patchset below, which tried to do something similar to this patch:
https://lore.kernel.org/all/20211009093724.10539-5-linyunsheng@huawei.com/

It seems this patch only tries to optimize one specific case of skb
coalescing. So if coalescing between a non-cloned and a cloned skb is a
common case, then it might be worth optimizing.


> 
> Without page pool: 70%
> With page pool: 13%
> 

...

> diff --git a/include/net/page_pool.h b/include/net/page_pool.h
> index 126f9e294389..05e5d8ead63b 100644
> --- a/include/net/page_pool.h
> +++ b/include/net/page_pool.h
> @@ -399,4 +399,25 @@ static inline void page_pool_nid_changed(struct page_pool *pool, int new_nid)
>  		page_pool_update_nid(pool, new_nid);
>  }
>  
> +static inline bool page_pool_is_pp_page(struct page *page)
> +{
> +	return (page->pp_magic & ~0x3UL) == PP_SIGNATURE;
> +}
> +
> +static inline bool page_pool_is_pp_page_frag(struct page *page)
> +{
> +	return !!(page->pp->p.flags & PP_FLAG_PAGE_FRAG);
> +}
> +
> +static inline void page_pool_page_ref(struct page *page)
> +{
> +	struct page *head_page = compound_head(page);

It seems we could avoid adding head_page here:
page = compound_head(page);

> +
> +	if (page_pool_is_pp_page(head_page) &&
> +			page_pool_is_pp_page_frag(head_page))
> +		atomic_long_inc(&head_page->pp_frag_count);
> +	else
> +		get_page(head_page);

page_ref_inc() should be enough here instead of get_page(), as
compound_head() has already been called.
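
With both comments applied, the helper could look roughly like the
user-space sketch below. This is only a model to show the shape:
struct page_model, the *_MODEL constants (placeholder values, not the
kernel's) and compound_head_model() are hypothetical stand-ins, and the
plain increments stand for atomic_long_inc() and page_ref_inc().

```c
#include <stdbool.h>

#define PP_SIGNATURE_MODEL	0x40UL	/* placeholder, not the kernel value */
#define PP_FLAG_PAGE_FRAG_MODEL	0x4U	/* placeholder, not the kernel value */

/* Hypothetical stand-in for the struct page fields the helper touches. */
struct page_model {
	struct page_model *head;	/* NULL for a head or order-0 page */
	unsigned long pp_magic;
	unsigned int pp_flags;		/* stands in for page->pp->p.flags */
	long pp_frag_count;		/* atomic_long_t in the kernel */
	int refcount;			/* page reference count */
};

static struct page_model *compound_head_model(struct page_model *page)
{
	return page->head ? page->head : page;
}

static bool is_pp_page(const struct page_model *page)
{
	/* The low two bits of pp_magic are masked off, as in the patch. */
	return (page->pp_magic & ~0x3UL) == PP_SIGNATURE_MODEL;
}

static bool is_pp_page_frag(const struct page_model *page)
{
	return !!(page->pp_flags & PP_FLAG_PAGE_FRAG_MODEL);
}

static void page_pool_page_ref_model(struct page_model *page)
{
	/* Reuse the parameter instead of adding a head_page local. */
	page = compound_head_model(page);

	if (is_pp_page(page) && is_pp_page_frag(page))
		page->pp_frag_count++;	/* atomic_long_inc() in the kernel */
	else
		page->refcount++;	/* page_ref_inc() in the kernel */
}
```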

> +}
> +
>  #endif /* _NET_PAGE_POOL_H */
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 6c5915efbc17..9806b091f0f6 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -5666,8 +5666,7 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
>  	 * !@to->pp_recycle but its tricky (due to potential race with
>  	 * the clone disappearing) and rare, so not worth dealing with.
>  	 */
> -	if (to->pp_recycle != from->pp_recycle ||
> -	    (from->pp_recycle && skb_cloned(from)))
> +	if (to->pp_recycle != from->pp_recycle)
>  		return false;
>  
>  	if (len <= skb_tailroom(to)) {
> @@ -5724,8 +5723,12 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
>  	/* if the skb is not cloned this does nothing
>  	 * since we set nr_frags to 0.
>  	 */
> -	for (i = 0; i < from_shinfo->nr_frags; i++)
> -		__skb_frag_ref(&from_shinfo->frags[i]);
> +	if (from->pp_recycle)
> +		for (i = 0; i < from_shinfo->nr_frags; i++)
> +			page_pool_page_ref(skb_frag_page(&from_shinfo->frags[i]));
> +	else
> +		for (i = 0; i < from_shinfo->nr_frags; i++)
> +			__skb_frag_ref(&from_shinfo->frags[i]);
>  
>  	to->truesize += delta;
>  	to->len += len;
>