Message-ID: <20191025153353.606e4b0d@carbon>
Date: Fri, 25 Oct 2019 15:33:53 +0200
From: Jesper Dangaard Brouer <brouer@...hat.com>
To: Saeed Mahameed <saeedm@...lanox.com>
Cc: "David S. Miller" <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Jonathan Lemon <jonathan.lemon@...il.com>,
"ilias.apalodimas@...aro.org" <ilias.apalodimas@...aro.org>,
brouer@...hat.com
Subject: Re: [PATCH net-next V2 2/3] page_pool: Don't recycle non-reusable pages
On Wed, 23 Oct 2019 19:37:00 +0000
Saeed Mahameed <saeedm@...lanox.com> wrote:
> A page is NOT reusable when at least one of the following is true:
> 1) it was allocated when the system was under some pressure (page_is_pfmemalloc).
> 2) it belongs to a different NUMA node than pool->p.nid.
>
> To update pool->p.nid users should call page_pool_update_nid().
>
> Holding on to such pages in the pool will hurt the consumer
> performance when the pool migrates to a different NUMA node.
>
> Performance testing:
> XDP drop/TX rate and TCP single/multi stream, on the mlx5 driver,
> while migrating the RX ring IRQ from a close to a far NUMA node:
>
> mlx5 internal page cache was locally disabled to get pure page pool
> results.
Could you show us the code that disables the local page cache?
> CPU: Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz
> NIC: Mellanox Technologies MT27700 Family [ConnectX-4] (100G)
>
> XDP Drop/TX single core:
> NUMA | XDP | Before | After
> ---------------------------------------
> Close | Drop | 11 Mpps | 10.9 Mpps
> Far | Drop | 4.4 Mpps | 5.8 Mpps
>
> Close | TX | 6.5 Mpps | 6.5 Mpps
> Far | TX | 3.5 Mpps | 4 Mpps
>
> The improvement is about 30% in drop packet rate and 15% in TX packet
> rate for the far NUMA test.
> No degradation for the close NUMA tests.
>
> TCP single/multi cpu/stream:
> NUMA | #cpu | Before | After
> --------------------------------------
> Close | 1 | 18 Gbps | 18 Gbps
> Far | 1 | 15 Gbps | 18 Gbps
> Close | 12 | 80 Gbps | 80 Gbps
> Far | 12 | 68 Gbps | 80 Gbps
>
> In all test cases we see an improvement for the far NUMA case, and no
> impact on the close NUMA case.
>
> The impact of adding a check per page is negligible, and shows no
> performance degradation whatsoever. Functionality-wise it also seems
> more correct and more robust for the page pool to verify whether pages
> should be recycled, since the page pool can't guarantee where pages
> are coming from.
>
> Signed-off-by: Saeed Mahameed <saeedm@...lanox.com>
> Acked-by: Jonathan Lemon <jonathan.lemon@...il.com>
> ---
> net/core/page_pool.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 953af6d414fb..73e4173c4dce 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -283,6 +283,17 @@ static bool __page_pool_recycle_direct(struct page *page,
> return true;
> }
>
> +/* page is NOT reusable when:
> + * 1) allocated when system is under some pressure. (page_is_pfmemalloc)
> + * 2) belongs to a different NUMA node than pool->p.nid.
> + *
> + * To update pool->p.nid users must call page_pool_update_nid.
> + */
> +static bool pool_page_reusable(struct page_pool *pool, struct page *page)
> +{
> + return !page_is_pfmemalloc(page) && page_to_nid(page) == pool->p.nid;
> +}
> +
> void __page_pool_put_page(struct page_pool *pool,
> struct page *page, bool allow_direct)
> {
> @@ -292,7 +303,8 @@ void __page_pool_put_page(struct page_pool *pool,
> *
> * refcnt == 1 means page_pool owns page, and can recycle it.
> */
> - if (likely(page_ref_count(page) == 1)) {
> + if (likely(page_ref_count(page) == 1 &&
> + pool_page_reusable(pool, page))) {
I'm afraid that we are slowly chipping away at the performance benefit
with these incremental changes, adding more checks. We have an extreme
performance use-case with XDP_DROP, where we want drivers to use this
code path to hit __page_pool_recycle_direct(), which is a simple array
update (protected under NAPI) into pool->alloc.cache[].

To preserve this hot path, you could instead flush pool->alloc.cache[]
in the call to page_pool_update_nid(), and move the pool_page_reusable()
check into __page_pool_recycle_into_ring(). (Below I have marked the
remaining code with '>>' to make this easier to see; a rough sketch of
the suggestion follows the quoted code.)
> /* Read barrier done in page_ref_count / READ_ONCE */
>
> if (allow_direct && in_serving_softirq())
>> if (__page_pool_recycle_direct(page, pool))
>> return;
>>
>> if (!__page_pool_recycle_into_ring(pool, page)) {
>> /* Cache full, fallback to free pages */
>> __page_pool_return_page(pool, page);
>> }
>> return;
>> }
>> /* Fallback/non-XDP mode: API user have elevated refcnt.
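Something like this completely untested sketch, just to illustrate the
idea. It reuses __page_pool_return_page() and pool_page_reusable() from
the patch above, and assumes the page_pool_update_nid() mentioned in the
commit message runs in the same NAPI/softirq context that fills
pool->alloc.cache[]:

/* Sketch only: flush the NAPI-protected alloc cache when the NUMA
 * node changes, so the XDP_DROP fast path keeps its single branch.
 */
void page_pool_update_nid(struct page_pool *pool, int new_nid)
{
        struct page *page;

        if (pool->p.nid == new_nid)
                return;

        pool->p.nid = new_nid;

        /* Cached pages may belong to the old NUMA node: return them */
        while (pool->alloc.count) {
                page = pool->alloc.cache[--pool->alloc.count];
                __page_pool_return_page(pool, page);
        }
}

/* Sketch only: the pfmemalloc/NUMA check moves to the slower
 * ptr_ring path, leaving the direct-recycle path untouched.
 */
static bool __page_pool_recycle_into_ring(struct page_pool *pool,
                                          struct page *page)
{
        int ret;

        if (!pool_page_reusable(pool, page))
                return false; /* caller falls back to freeing the page */

        /* BH protection not needed if current is serving softirq */
        if (in_serving_softirq())
                ret = ptr_ring_produce(&pool->ring, page);
        else
                ret = ptr_ring_produce_bh(&pool->ring, page);

        return (ret == 0) ? true : false;
}

That way the per-page check only costs us on the ring path, and a NID
update pays the flush cost once instead of on every recycle.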
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
For easier review:
/* Only allow direct recycling in special circumstances, into the
* alloc side cache. E.g. during RX-NAPI processing for XDP_DROP use-case.
*
* Caller must provide appropriate safe context.
*/
static bool __page_pool_recycle_direct(struct page *page,
struct page_pool *pool)
{
if (unlikely(pool->alloc.count == PP_ALLOC_CACHE_SIZE))
return false;
/* Caller MUST have verified/know (page_ref_count(page) == 1) */
pool->alloc.cache[pool->alloc.count++] = page;
return true;
}