netdev - Re: [net-next,v8,09/13] octeontx2-af: npc: cn20k: virtual index support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20260205022525.2271130-1-kuba@kernel.org>
Date: Wed,  4 Feb 2026 18:25:25 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: rkannoth@...vell.com
Cc: Jakub Kicinski <kuba@...nel.org>,
	edumazet@...gle.com,
	sgoutham@...vell.com,
	andrew+netdev@...n.ch,
	pabeni@...hat.com,
	netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	sumang@...vell.com,
	sbhatta@...vell.com,
	davem@...emloft.net
Subject: Re: [net-next,v8,09/13] octeontx2-af: npc: cn20k: virtual index support

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
octeontx2-af: npc: cn20k: virtual index support

This patch adds support for virtual MCAM index allocation and improves
CN20K MCAM defragmentation handling. Virtual indexes allow the hardware
to move mapped MCAM entries internally, enabling defragmentation and
preventing scattered allocations across subbanks.

> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
> --- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
> +++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c

[ ... ]

> +static int npc_vidx_maps_modify(struct rvu *rvu, u16 vidx, u16 new_midx)
> +{
> +	u16 old_midx;
> +	void *map;
> +	int rc;
> +

[ ... ]

> +fail3:
> +	if (xa_insert(&npc_priv.xa_idx2vidx_map, old_midx,
> +		      xa_mk_value(vidx), GFP_KERNEL))
> +		dev_err(rvu->dev,
> +			"%s: Error to roll back idx2vidx old_midx=%u vidx=%u\n",
> +			__func__, old_midx, vidx);
> +fail2:
> +	if (xa_erase(&npc_priv.xa_vidx2idx_map, vidx))
> +		dev_err(rvu->dev,
> +			"%s: Failed to roll back vidx2idx vidx=%u\n",
> +			__func__, vidx);

At fail2, xa_erase() returns the erased entry on success (non-NULL) and NULL
on failure. The error message is printed when xa_erase() succeeds, which
appears inverted from the intended behavior.

Should the condition check for NULL to detect failure instead?

> +
> +fail1:
> +	if (xa_insert(&npc_priv.xa_vidx2idx_map, vidx,
> +		      xa_mk_value(old_midx), GFP_KERNEL))
> +		dev_err(rvu->dev,
> +			"%s: Faled to roll back vidx2idx to old_midx=%u, vidx=%u\n",
> +			__func__, old_midx, vidx);

This isn't a bug, but there's a typo: "Faled" should be "Failed".

Also, at fail1, the rollback attempts to insert into xa_vidx2idx_map to
restore old_midx. However, looking at the control flow, fail1 is reached when
xa_insert() to xa_vidx2idx_map fails. At that point, xa_vidx2idx_map has
already been erased, so the rollback target is correct.

But fail2 rolls back using xa_erase() on xa_vidx2idx_map, which was just
inserted successfully. The rollback at fail1 should restore xa_idx2vidx_map
instead, since that's the only map modified when fail1 is reached via direct
return.

Could the error path rollback logic be reviewed?

> +
> +	return rc;
> +}

[ ... ]

> +static int npc_vidx_maps_del_entry(struct rvu *rvu, u16 vidx, u16 *old_midx)
> +{
> +	u16 mcam_idx;
> +	void *map;
> +
> +	if (!npc_is_vidx(vidx)) {
> +		dev_err(rvu->dev,
> +			"%s: vidx(%u) does not map to proper mcam idx\n",
> +			__func__, vidx);

The format string says "mcam idx" but the code passes vidx. Should this
say "vidx" for consistency?

[ ... ]

> +static
> +int npc_defrag_move_vdx_to_free(struct rvu *rvu,
> +				struct npc_defrag_node *f,
> +				struct npc_defrag_node *v,
> +				int cnt, u16 *save)
> +{
> +	struct npc_mcam *mcam = &rvu->hw->mcam;

[ ... ]

> +		/* Remove from mcam maps */
> +		mcam->entry2pfvf_map[old_midx] = NPC_MCAM_INVALID_MAP;
> +		mcam->entry2cntr_map[old_midx] = NPC_MCAM_INVALID_MAP;
> +		npc_mcam_clear_bit(mcam, old_midx);
> +
> +		mcam->entry2pfvf_map[new_midx] = pcifunc;
> +		mcam->entry2cntr_map[new_midx] = new_midx;

Should entry2cntr_map[new_midx] be assigned the counter index from the
old entry rather than new_midx? The entry2cntr_map is used to track which
counter is mapped to an entry, and during defragmentation the counter
association should be preserved from entry2cntr_map[old_midx].

[ ... ]

> @@ -2214,14 +2444,15 @@ static int npc_idx_free(struct rvu *rvu, u16 *mcam_idx, int count,
> 			bool maps_del)
>  {
>  	struct npc_subbank *sb;
> -	int idx, i;
> +	u16 vidx, midx;
> +	int sb_off, i;
>  	bool ret;
>  	int rc;
>
>  	/* Check if we can dealloc indexes properly ? */
>  	for (i = 0; i < count; i++) {
>  		rc =  npc_mcam_idx_2_subbank_idx(rvu, mcam_idx[i],
> -						 &sb, &idx);
> +						 &sb, &sb_off);
>  		if (rc) {
>  			dev_err(rvu->dev,
>  				"Failed to free mcam idx=%u\n", mcam_idx[i]);

The first validation loop passes mcam_idx[i] directly to
npc_mcam_idx_2_subbank_idx() without converting virtual indexes first.
Virtual indexes are defined as vidx >= bank_depth * 2, but
npc_mcam_idx_2_subbank_idx() returns -EINVAL for any index >= bank_depth * 2:

npc_mcam_idx_2_key_type():
    if (mcam_idx >= npc_priv.bank_depth * 2) {
        dev_err(rvu->dev, "%s: bad params\n", __func__);
        return -EINVAL;
    }

This means all virtual index free operations will fail at the first validation
loop, even though the second loop has correct vidx handling code.

Should the first validation loop convert virtual indexes using
npc_cn20k_vidx2idx() before calling npc_mcam_idx_2_subbank_idx()?

> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
> --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
> +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c

[ ... ]

> +static int rvu_af_npc_defrag(struct devlink *devlink, u32 id,
> +			     struct devlink_param_gset_ctx *ctx,
> +			     struct netlink_ext_ack *extack)
> +{
> +	struct rvu_devlink *rvu_dl = devlink_priv(devlink);
> +	struct rvu *rvu = rvu_dl->rvu;
> +
> +	npc_cn20k_defrag(rvu);
> +
> +	return 0;
> +}

The return value of npc_cn20k_defrag() is discarded. npc_cn20k_defrag() can
fail with -ENOMEM when kcalloc() fails for node or vidx arrays, or with errors
propagated from npc_defrag_process().

Should this be "return npc_cn20k_defrag(rvu);" instead?

> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
> --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
> +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c

[ ... ]

> @@ -3256,7 +3266,7 @@
>  	/* If shift has failed then report the failed index */
>  	if (index != req->shift_count) {
>  		rc = NPC_MCAM_PERM_DENIED;
> -		rsp->failed_entry_idx = index;
> +		rsp->failed_entry_idx = npc_cn20k_idx2vidx(index);
>  	}

The variable "index" here is a loop counter (0 to shift_count-1), not an MCAM
entry index. The failed_entry_idx field indicates which position in the request
arrays failed.

Should this remain as "rsp->failed_entry_idx = index;" without the conversion?