Message-ID: <CAKhg4tJkprS+dFcpLALP_e1kpHJ-DwabOMFaXxsPx+7O0c-geQ@mail.gmail.com>
Date: Thu, 29 Jun 2023 20:17:23 +0800
From: Liang Chen <liangchen.linux@...il.com>
To: Yunsheng Lin <linyunsheng@...wei.com>
Cc: ilias.apalodimas@...aro.org, hawk@...nel.org, kuba@...nel.org,
davem@...emloft.net, edumazet@...gle.com, pabeni@...hat.com,
netdev@...r.kernel.org
Subject: Re: [PATCH net-next] skbuff: Optimize SKB coalescing for page pool case
On Thu, Jun 29, 2023 at 2:53 PM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>
> On 2023/6/28 20:11, Liang Chen wrote:
> > In order to address the issues encountered with commit 1effe8ca4e34
> > ("skbuff: fix coalescing for page_pool fragment recycling"), the
> > following combination of conditions was excluded from skb coalescing:
> >
> > from->pp_recycle = 1
> > from->cloned = 1
> > to->pp_recycle = 1
> >
> > However, in page pool environments, the aforementioned combination can
> > be quite common. In scenarios with a high proportion of small packets,
> > it can significantly reduce the coalescing success rate. For example,
> > for packets of 256 bytes in size, our comparison of the coalescing
> > success rate is as follows:
>
> As skb_try_coalesce() only allows coalescing when the 'to' skb is not cloned.
>
> Could you give more details about the testing when we have a non-cloned
> 'to' skb and a cloned 'from' skb? As both of them should belong to the
> same flow.
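
For context, skb_try_coalesce() rejects a cloned 'to' skb up front, before
any of the pp_recycle checks run; trimmed from net/core/skbuff.c, roughly:

bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
		      bool *fragstolen, int *delta_truesize)
{
	...
	/* a cloned 'to' shares its frags with the clone, so coalescing
	 * into it is refused immediately
	 */
	if (skb_cloned(to))
		return false;
	...
}
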
>
> I had the patchset below trying to do something similar to what this
> patch does:
> https://lore.kernel.org/all/20211009093724.10539-5-linyunsheng@huawei.com/
>
> It seems this patch is only trying to optimize a specific case of skb
> coalescing. So if skb coalescing between a non-cloned and a cloned skb
> is a common case, then it might be worth optimizing.
>
Sure, thanks for the information! The testing is just a common iperf
run, as below (-l 128 makes iperf3 send 128-byte application writes,
and -i 5 prints statistics every 5 seconds):
iperf3 -c <server IP> -i 5 -f g -t 0 -l 128
We observed the frequency of each combination of the pp (page pool)
and clone conditions on entry to skb_try_coalesce (a sketch of the
counting hook follows the numbers below). The results motivated us to
propose this optimization, as we noticed that case 11 (from
pp/clone=1/1 and to pp/clone=1/0) occurs quite often.
+-------------+--------------+--------------+--------------+--------------+
| to \ from   | pp/clone=0/0 | pp/clone=0/1 | pp/clone=1/0 | pp/clone=1/1 |
+-------------+--------------+--------------+--------------+--------------+
|pp/clone=0/0 |       0      |       1      |       2      |       3      |
|pp/clone=0/1 |       4      |       5      |       6      |       7      |
|pp/clone=1/0 |       8      |       9      |      10      |      11      |
|pp/clone=1/1 |      12      |      13      |      14      |      15      |
+-------------+--------------+--------------+--------------+--------------+
(Rows are the 'to' skb state and columns the 'from' skb state, so case
11 is a from skb with pp/clone=1/1 coalesced into a to skb with
pp/clone=1/0.)
packet size 128:
total : 152903
0 : 0 (0%)
1 : 0 (0%)
2 : 0 (0%)
3 : 0 (0%)
4 : 0 (0%)
5 : 0 (0%)
6 : 0 (0%)
7 : 0 (0%)
8 : 0 (0%)
9 : 0 (0%)
10 : 20681 (13%)
11 : 90136 (58%)
12 : 0 (0%)
13 : 0 (0%)
14 : 0 (0%)
15 : 42086 (27%)
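
A debug hook along the following lines can collect these counters; this
is a minimal sketch, and the array and helper names are made up for
illustration, they are not part of the patch:

/* Bucket each skb_try_coalesce() entry by the (pp_recycle, cloned)
 * state of 'to' and 'from', matching the 16-case table above
 * (index = 'to' state * 4 + 'from' state, state = pp * 2 + clone).
 */
static atomic_long_t coalesce_cases[16];

static void count_coalesce_case(const struct sk_buff *to,
				const struct sk_buff *from)
{
	unsigned int to_state = (to->pp_recycle << 1) | skb_cloned(to);
	unsigned int from_state = (from->pp_recycle << 1) | skb_cloned(from);

	atomic_long_inc(&coalesce_cases[to_state * 4 + from_state]);
}
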
Thanks,
Liang
>
> >
> > Without page pool: 70%
> > With page pool: 13%
> >
>
> ...
>
> > diff --git a/include/net/page_pool.h b/include/net/page_pool.h
> > index 126f9e294389..05e5d8ead63b 100644
> > --- a/include/net/page_pool.h
> > +++ b/include/net/page_pool.h
> > @@ -399,4 +399,25 @@ static inline void page_pool_nid_changed(struct page_pool *pool, int new_nid)
> > page_pool_update_nid(pool, new_nid);
> > }
> >
> > +static inline bool page_pool_is_pp_page(struct page *page)
> > +{
> > + return (page->pp_magic & ~0x3UL) == PP_SIGNATURE;
> > +}
> > +
> > +static inline bool page_pool_is_pp_page_frag(struct page *page)
> > +{
> > + return !!(page->pp->p.flags & PP_FLAG_PAGE_FRAG);
> > +}
> > +
> > +static inline void page_pool_page_ref(struct page *page)
> > +{
> > + struct page *head_page = compound_head(page);
>
> It seems we could avoid adding head_page here:
> page = compound_head(page);
>
> > +
> > + if (page_pool_is_pp_page(head_page) &&
> > + page_pool_is_pp_page_frag(head_page))
> > + atomic_long_inc(&head_page->pp_frag_count);
> > + else
> > + get_page(head_page);
>
> page_ref_inc() should be enough here instead of get_page(),
> as compound_head() has already been called.
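
For reference, with both of those suggestions applied, the helper would
reduce to roughly the following (untested sketch):

static inline void page_pool_page_ref(struct page *page)
{
	page = compound_head(page);

	if (page_pool_is_pp_page(page) &&
	    page_pool_is_pp_page_frag(page))
		atomic_long_inc(&page->pp_frag_count);
	else
		page_ref_inc(page);
}
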
>
> > +}
> > +
> > #endif /* _NET_PAGE_POOL_H */
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 6c5915efbc17..9806b091f0f6 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -5666,8 +5666,7 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
> > * !@to->pp_recycle but its tricky (due to potential race with
> > * the clone disappearing) and rare, so not worth dealing with.
> > */
> > - if (to->pp_recycle != from->pp_recycle ||
> > - (from->pp_recycle && skb_cloned(from)))
> > + if (to->pp_recycle != from->pp_recycle)
> > return false;
> >
> > if (len <= skb_tailroom(to)) {
> > @@ -5724,8 +5723,12 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
> > /* if the skb is not cloned this does nothing
> > * since we set nr_frags to 0.
> > */
> > - for (i = 0; i < from_shinfo->nr_frags; i++)
> > - __skb_frag_ref(&from_shinfo->frags[i]);
> > + if (from->pp_recycle)
> > + for (i = 0; i < from_shinfo->nr_frags; i++)
> > + page_pool_page_ref(skb_frag_page(&from_shinfo->frags[i]));
> > + else
> > + for (i = 0; i < from_shinfo->nr_frags; i++)
> > + __skb_frag_ref(&from_shinfo->frags[i]);
> >
> > to->truesize += delta;
> > to->len += len;
> >