Message-ID: <20230907150823.GF434333@kernel.org>
Date: Thu, 7 Sep 2023 17:08:23 +0200
From: Simon Horman <horms@...nel.org>
To: Ratheesh Kannoth <rkannoth@...vell.com>
Cc: netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
sgoutham@...vell.com, gakula@...vell.com, sbhatta@...vell.com,
hkelam@...vell.com, davem@...emloft.net, edumazet@...gle.com,
kuba@...nel.org, pabeni@...hat.com, hawk@...nel.org,
alexander.duyck@...il.com, ilias.apalodimas@...aro.org,
linyunsheng@...wei.com, bigeasy@...utronix.de
Subject: Re: [PATCH net v2] octeontx2-pf: Fix page pool cache index
corruption.
On Thu, Sep 07, 2023 at 07:17:11AM +0530, Ratheesh Kannoth wrote:
> Access to the page pool `cache' array and the `count' variable
> is not locked. Page pool cache access is fine as long as there
> is only one consumer per pool.
>
> The octeontx2 driver fills rx buffers from the page pool in NAPI
> context. If the system is stressed and cannot allocate buffers, the
> refill work is delegated to a delayed workqueue. This means that
> there are two consumers of the page pool cache.
>
> Either the workqueue or IRQ/NAPI can run on another CPU, leading to
> lockless access and hence corruption of the page pool cache indexes.
>
> To fix this issue, the workqueue now reschedules NAPI instead of
> refilling the pool itself, so rx buffers are always refilled from
> NAPI context.
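For anyone following along, the scheme described above looks roughly
like the sketch below. The refill_work layout and field names here are
illustrative assumptions, not the driver's exact code:

#include <linux/netdevice.h>
#include <linux/workqueue.h>

/* Hypothetical plumbing between the delayed work and its NAPI
 * instance; the real driver's struct may differ. */
struct refill_work {
	struct delayed_work pool_refill_work;
	struct napi_struct *napi;
};

static void otx2_pool_refill_task(struct work_struct *work)
{
	struct refill_work *wrk = container_of(work, struct refill_work,
					       pool_refill_work.work);

	/* Do not touch the page pool from here - that would be a
	 * second consumer. Hand the refill back to NAPI instead.
	 * BHs are disabled so that the NET_RX softirq raised by
	 * napi_schedule() runs promptly on local_bh_enable().
	 */
	local_bh_disable();
	napi_schedule(wrk->napi);
	local_bh_enable();
}

The local_bh_disable()/local_bh_enable() pair matters because
napi_schedule() only raises NET_RX_SOFTIRQ; re-enabling BHs from
process context is what guarantees the softirq is run promptly.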
>
> Fixes: b2e3406a38f0 ("octeontx2-pf: Add support for page pool")
> Signed-off-by: Ratheesh Kannoth <rkannoth@...vell.com>
>
> ---
> ChangeLogs
> v1 -> v2: Added local_bh_disable() to make NAPI scheduling precise.
>
> v0 -> v1: udelay would waste CPU cycles on UP systems, so call
> napi_schedule from delayed workqueue context instead.
>
> ---
> .../ethernet/marvell/octeontx2/nic/cn10k.c | 4 +-
> .../ethernet/marvell/octeontx2/nic/cn10k.h | 2 +-
> .../marvell/octeontx2/nic/otx2_common.c | 43 +++----------------
> .../marvell/octeontx2/nic/otx2_common.h | 3 +-
> .../ethernet/marvell/octeontx2/nic/otx2_pf.c | 7 +--
> .../marvell/octeontx2/nic/otx2_txrx.c | 29 ++++++++++---
> .../marvell/octeontx2/nic/otx2_txrx.h | 4 +-
> 7 files changed, 42 insertions(+), 50 deletions(-)
>
> diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k.c b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k.c
> index 826f691de259..211c7d8a0556 100644
> --- a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k.c
> +++ b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k.c
> @@ -107,12 +107,13 @@ int cn10k_sq_aq_init(void *dev, u16 qidx, u16 sqb_aura)
> }
>
> #define NPA_MAX_BURST 16
> -void cn10k_refill_pool_ptrs(void *dev, struct otx2_cq_queue *cq)
> +int cn10k_refill_pool_ptrs(void *dev, struct otx2_cq_queue *cq)
> {
> struct otx2_nic *pfvf = dev;
> u64 ptrs[NPA_MAX_BURST];
> int num_ptrs = 1;
> dma_addr_t bufptr;
> + int cnt = cq->pool_ptrs;
nit: please arrange local variables in new Networking code in reverse xmas
tree order - longest line to shortest.
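For example, the declarations in this function reordered that way
(ties between equal-length lines can go either way):

	struct otx2_nic *pfvf = dev;
	u64 ptrs[NPA_MAX_BURST];
	int cnt = cq->pool_ptrs;
	dma_addr_t bufptr;
	int num_ptrs = 1;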
>
> /* Refill pool with new buffers */
> while (cq->pool_ptrs) {
...