Message-ID: <64caa039-14fb-c883-de1c-6549b5314269@redhat.com>
Date: Mon, 27 Jun 2022 11:50:44 +0200
From: Jesper Dangaard Brouer <jbrouer@...hat.com>
To: Guangbin Huang <huangguangbin2@...wei.com>, hawk@...nel.org,
ilias.apalodimas@...aro.org, davem@...emloft.net, kuba@...nel.org,
edumazet@...gle.com, pabeni@...hat.com
Cc: brouer@...hat.com, lorenzo@...nel.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
lipeng321@...wei.com, chenhao288@...ilicon.com
Subject: Re: [PATCH net-next] net: page_pool: optimize page pool page
allocation in NUMA scenario
On 24/06/2022 11.36, Guangbin Huang wrote:
> From: Jie Wang <wangjie125@...wei.com>
>
> Currently NIC packet receiving performance based on page pool deteriorates
> occasionally. To analyze the causes of this problem, page allocation stats
> were collected. Here are the stats when NIC rx performance deteriorates:
>
> bandwidth(Gbits/s)               16.8      6.91
> rx_pp_alloc_fast             13794308  21141869
> rx_pp_alloc_slow               108625    166481
> rx_pp_alloc_slow_h                  0         0
> rx_pp_alloc_empty                8192      8192
> rx_pp_alloc_refill                  0         0
> rx_pp_alloc_waive              100433    158289
> rx_pp_recycle_cached                0         0
> rx_pp_recycle_cache_full            0         0
> rx_pp_recycle_ring             362400    420281
> rx_pp_recycle_ring_full       6064893   9709724
> rx_pp_recycle_released_ref          0         0
>
> The rx_pp_alloc_waive count indicates that a large number of pages sit on
> a NUMA node different from the NIC device's NUMA node. Therefore these pages
> can't be reused by the page pool. As a result, many new pages have to be
> allocated by __page_pool_alloc_pages_slow, which is time consuming. This
> causes the NIC rx performance fluctuations.
>
> The main reason for the large number of NUMA-mismatched pages in the page
> pool is that the page pool uses alloc_pages_bulk_array to allocate its
> pages. This function is not suitable for page allocation in a NUMA scenario.
> So this patch uses alloc_pages_bulk_array_node, which takes a NUMA node id
> as an input parameter, to ensure NUMA consistency between the NIC device
> and the allocated pages.
>
> The NIC rx performance test was repeated 40 times. NIC rx bandwidth is
> higher and more stable compared to the data above. Here are the stats of
> three tests: the rx_pp_alloc_waive count is zero, and rx_pp_alloc_slow,
> which indicates pages allocated from the slow path, is relatively low.
>
> bandwidth(Gbits/s)                 93      93.9      93.8
> rx_pp_alloc_fast             60066264  61266386  60938254
> rx_pp_alloc_slow                16512     16517     16539
> rx_pp_alloc_slow_ho                 0         0         0
> rx_pp_alloc_empty               16512     16517     16539
> rx_pp_alloc_refill             473841    481910    481585
> rx_pp_alloc_waive                   0         0         0
> rx_pp_recycle_cached                0         0         0
> rx_pp_recycle_cache_full            0         0         0
> rx_pp_recycle_ring           29754145  30358243  30194023
> rx_pp_recycle_ring_full             0         0         0
> rx_pp_recycle_released_ref          0         0         0
>
> Signed-off-by: Jie Wang <wangjie125@...wei.com>
> ---
> net/core/page_pool.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)

Thanks for improving this, but we need some small adjustments below.
And then you need to send a V2 of the patch.

> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index f18e6e771993..15997fcd78f3 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -377,6 +377,7 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
>  	unsigned int pp_order = pool->p.order;
>  	struct page *page;
>  	int i, nr_pages;
> +	int pref_nid; /* preferred NUMA node */
>
>  	/* Don't support bulk alloc for high-order pages */
>  	if (unlikely(pp_order))
> @@ -386,10 +387,18 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
>  	if (unlikely(pool->alloc.count > 0))
>  		return pool->alloc.cache[--pool->alloc.count];
>
> +#ifdef CONFIG_NUMA
> +	pref_nid = (pool->p.nid == NUMA_NO_NODE) ? numa_mem_id() : pool->p.nid;
> +#else
> +	/* Ignore pool->p.nid setting if !CONFIG_NUMA, helps compiler */

Remove "helps compiler" from the comment; it only makes sense in the code
this was copy-pasted from.

> +	pref_nid = numa_mem_id(); /* will be zero like page_to_nid() */

The comment about "page_to_nid()" is only relevant in the code
this was copy-pasted from.

Change to:

	pref_nid = NUMA_NO_NODE;

As alloc_pages_bulk_array_node() will be inlined, the effect (generated
asm code) will be the same, but it will be better for code maintenance.
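
To illustrate the inlining argument, here is a minimal userspace sketch. It
assumes that alloc_pages_bulk_array_node() replaces NUMA_NO_NODE with
numa_mem_id() before calling the underlying bulk allocator. Everything here
is a stand-in: the simplified signature (no gfp mask, no page array),
mock_local_node, mock_bulk_alloc() and the two pref_nid_*() helpers are
hypothetical names used only for this example, not kernel API.

```c
#include <assert.h>

#define NUMA_NO_NODE (-1)

/* Stand-in for the kernel's numa_mem_id(); 0 models the !CONFIG_NUMA case. */
static int mock_local_node = 0;
static int numa_mem_id(void) { return mock_local_node; }

/* Stand-in for the underlying bulk allocator: records the node it was asked
 * for and pretends every requested page was allocated. */
static int last_nid;
static unsigned long mock_bulk_alloc(int nid, unsigned long nr_pages)
{
	last_nid = nid;
	return nr_pages;
}

/* Simplified model of the inline wrapper (assumption: NUMA_NO_NODE is
 * resolved to the local memory node here, so callers need not do it). */
static inline unsigned long
alloc_pages_bulk_array_node(int nid, unsigned long nr_pages)
{
	if (nid == NUMA_NO_NODE)
		nid = numa_mem_id();
	return mock_bulk_alloc(nid, nr_pages);
}

/* !CONFIG_NUMA branch as posted in the patch. */
static int pref_nid_v1_no_numa(void) { return numa_mem_id(); }

/* !CONFIG_NUMA branch as suggested in this review. */
static int pref_nid_v2_no_numa(void) { return NUMA_NO_NODE; }
```

Under that assumption both variants hand the same node to the allocator, so
the only difference is which one is easier to read and maintain.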

> +#endif
> +
>  	/* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
>  	memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);
>
> -	nr_pages = alloc_pages_bulk_array(gfp, bulk, pool->alloc.cache);
> +	nr_pages = alloc_pages_bulk_array_node(gfp, pref_nid, bulk,
> +					       pool->alloc.cache);
>  	if (unlikely(!nr_pages))
>  		return NULL;
>