Message-ID: <dcdf039f-4040-4a31-9738-367eda59fd04@amd.com>
Date: Tue, 23 Jul 2024 08:42:45 -0700
From: Brett Creeley <bcreeley@....com>
To: Taehee Yoo <ap420073@...il.com>, davem@...emloft.net, kuba@...nel.org,
 pabeni@...hat.com, edumazet@...gle.com, michael.chan@...adcom.com,
 netdev@...r.kernel.org
Cc: somnath.kotur@...adcom.com, dw@...idwei.uk, horms@...nel.org
Subject: Re: [PATCH net v2] bnxt_en: update xdp_rxq_info in queue restart
 logic

On 7/20/2024 10:35 PM, Taehee Yoo wrote:
>
> When netdev_rx_queue_restart() restarts a queue, the bnxt_en driver
> replaces the queue's page_pool (it creates a new one and deletes the
> old one), but it does not update the xdp_rxq_info. As a result,
> bnxt_rx_ring_info->page_pool points to the new page_pool while
> bnxt_rx_ring_info->xdp_rxq is still connected to the old one.
> 
> The old page_pool is no longer used and should be freed by
> page_pool_destroy(), but it isn't: the stale xdp_rxq_info still holds
> a reference to it, so the old page_pool is never deleted by the queue
> restart logic.
> 
> Before restarting 1 queue:
> ./tools/net/ynl/samples/page-pool
> enp10s0f1np1[6] page pools: 4 (zombies: 0)
>          refs: 8192 bytes: 33554432 (refs: 0 bytes: 0)
>          recycling: 0.0% (alloc: 128:8048 recycle: 0:0)
> 
> After restarting 1 queue:
> ./tools/net/ynl/samples/page-pool
> enp10s0f1np1[6] page pools: 5 (zombies: 0)
>          refs: 10240 bytes: 41943040 (refs: 0 bytes: 0)
>          recycling: 20.0% (alloc: 160:10080 recycle: 1920:128)
> 
> Before the restart, the interface has 4 page_pools. After restarting
> a single queue it has 5, but it should still be 4: the restart logic
> created a new page_pool, and the old one was never deleted because
> the xdp_rxq_info was not updated.
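
For what it's worth, the reference described above comes from the
mem-model registration: registering the page_pool as the rxq's memory
model takes a reference that only unregistering releases. Roughly (a
sketch of the intended pairing, not code from this patch; dev and
queue_idx stand in for the real arguments):

	/* register: ties the page_pool to the rxq and takes a ref on it */
	xdp_rxq_info_reg(&rxr->xdp_rxq, dev, queue_idx, 0);
	xdp_rxq_info_reg_mem_model(&rxr->xdp_rxq, MEM_TYPE_PAGE_POOL,
				   rxr->page_pool);

	/* unregister: drops that ref; until this runs,
	 * page_pool_destroy() alone cannot actually free the pool
	 */
	xdp_rxq_info_unreg(&rxr->xdp_rxq);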
> 
> Fixes: 2d694c27d32e ("bnxt_en: implement netdev_queue_mgmt_ops")
> Signed-off-by: Taehee Yoo <ap420073@...il.com>
> ---
> 
> v2:
>   - Do not use memcpy() in bnxt_queue_start().
>   - Call xdp_rxq_info_unreg() before page_pool_destroy() in
>     bnxt_queue_mem_free().
> 
>   drivers/net/ethernet/broadcom/bnxt/bnxt.c | 17 +++++++++++++++++
>   1 file changed, 17 insertions(+)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index bb3be33c1bbd..ffa74c26ee53 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -4052,6 +4052,7 @@ static void bnxt_reset_rx_ring_struct(struct bnxt *bp,
> 
>          rxr->page_pool->p.napi = NULL;
>          rxr->page_pool = NULL;
> +       memset(&rxr->xdp_rxq, 0, sizeof(struct xdp_rxq_info));
> 
>          ring = &rxr->rx_ring_struct;
>          rmem = &ring->ring_mem;
> @@ -15018,6 +15019,16 @@ static int bnxt_queue_mem_alloc(struct net_device *dev, void *qmem, int idx)
>          if (rc)
>                  return rc;
> 
> +       rc = xdp_rxq_info_reg(&clone->xdp_rxq, bp->dev, idx, 0);
> +       if (rc < 0)
> +               goto err_page_pool_destroy;
> +
> +       rc = xdp_rxq_info_reg_mem_model(&clone->xdp_rxq,
> +                                       MEM_TYPE_PAGE_POOL,
> +                                       clone->page_pool);
> +       if (rc)
> +               goto err_rxq_info_unreg;
> +
>          ring = &clone->rx_ring_struct;
>          rc = bnxt_alloc_ring(bp, &ring->ring_mem);
>          if (rc)
> @@ -15047,6 +15058,9 @@ static int bnxt_queue_mem_alloc(struct net_device *dev, void *qmem, int idx)
>          bnxt_free_ring(bp, &clone->rx_agg_ring_struct.ring_mem);
>   err_free_rx_ring:
>          bnxt_free_ring(bp, &clone->rx_ring_struct.ring_mem);
> +err_rxq_info_unreg:
> +       xdp_rxq_info_unreg(&clone->xdp_rxq);

I think care needs to be taken when calling xdp_rxq_info_unreg() here
and then page_pool_destroy() below, since the xdp_rxq_info_unreg() call
flow eventually calls page_pool_destroy() itself. Similar comment below.
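
To make the concern concrete: if, say, bnxt_alloc_ring() fails after
xdp_rxq_info_reg_mem_model() has succeeded, the unwind falls through
and (as I read this patch) the pool gets destroyed twice:

	err_free_rx_ring:
		bnxt_free_ring(bp, &clone->rx_ring_struct.ring_mem);
	err_rxq_info_unreg:
		xdp_rxq_info_unreg(&clone->xdp_rxq);
		/* ^ for MEM_TYPE_PAGE_POOL this already destroys the pool */
	err_page_pool_destroy:
		clone->page_pool->p.napi = NULL;
		page_pool_destroy(clone->page_pool);	/* second destroy */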

> +err_page_pool_destroy:
>          clone->page_pool->p.napi = NULL;
>          page_pool_destroy(clone->page_pool);
>          clone->page_pool = NULL;
> @@ -15062,6 +15076,8 @@ static void bnxt_queue_mem_free(struct net_device *dev, void *qmem)
>          bnxt_free_one_rx_ring(bp, rxr);
>          bnxt_free_one_rx_agg_ring(bp, rxr);
> 
> +       xdp_rxq_info_unreg(&rxr->xdp_rxq);
> +

If the memory type is MEM_TYPE_PAGE_POOL, xdp_rxq_info_unreg() will
eventually call page_pool_destroy(). Unless I am missing something, I
think you want to remove the page_pool_destroy() call below?
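
The path I mean is in net/core/xdp.c; simplified from my reading of
the current tree:

	xdp_rxq_info_unreg(xdp_rxq)
	  -> xdp_rxq_info_unreg_mem_model(xdp_rxq)
	       -> xdp_unreg_mem_model(&xdp_rxq->mem)
	            -> if (mem->type == MEM_TYPE_PAGE_POOL)
	                       page_pool_destroy(xa->page_pool);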

Thanks,

Brett

>          page_pool_destroy(rxr->page_pool);
>          rxr->page_pool = NULL;
> 
> @@ -15145,6 +15161,7 @@ static int bnxt_queue_start(struct net_device *dev, void *qmem, int idx)
>          rxr->rx_sw_agg_prod = clone->rx_sw_agg_prod;
>          rxr->rx_next_cons = clone->rx_next_cons;
>          rxr->page_pool = clone->page_pool;
> +       rxr->xdp_rxq = clone->xdp_rxq;
> 
>          bnxt_copy_rx_ring(bp, rxr, clone);
> 
> --
> 2.34.1
> 
> 
