netdev - Re: [net-next,v5,09/13] octeontx2-af: npc: cn20k: virtual index support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <aXmEFX1HUKa448Ih@rkannoth-OptiPlex-7090>
Date: Wed, 28 Jan 2026 09:05:49 +0530
From: Ratheesh Kannoth <rkannoth@...vell.com>
To: Jakub Kicinski <kuba@...nel.org>
CC: <sbhatta@...vell.com>, <edumazet@...gle.com>, <andrew+netdev@...n.ch>,
        <netdev@...r.kernel.org>, <sgoutham@...vell.com>, <sumang@...vell.com>,
        <pabeni@...hat.com>, <davem@...emloft.net>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [net-next,v5,09/13] octeontx2-af: npc: cn20k: virtual index
 support

On 2026-01-28 at 07:59:28, Jakub Kicinski (kuba@...nel.org) wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
>
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> octeontx2-af: npc: cn20k: virtual index support
>
> This patch adds support for virtual MCAM index allocation and improves
> CN20K MCAM defragmentation handling.  Virtual indexes allow the hardware
> to move mapped MCAM entries internally, enabling defragmentation.
>
> > diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
> > index 8b77ac053d8d..d1a927f3d5db 100644
> > --- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
> > +++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
>
> [ ... ]
>
> > +static int npc_vidx_maps_modify(struct rvu *rvu, u16 vidx, u16 new_midx)
> > +{
> > +	u16 old_midx;
> > +	void *map;
> > +	int rc;
> > +
> > +	if (!npc_is_vidx(vidx)) {
> > +		dev_err(rvu->dev,
> > +			"%s: vidx(%u) does not map to proper mcam idx\n",
> > +			__func__, vidx);
> > +		return -ESRCH;
> > +	}
> > +
> > +	map = xa_erase(&npc_priv.xa_vidx2idx_map, vidx);
> > +	if (!map) {
> > +		dev_err(rvu->dev,
> > +			"%s: vidx(%u) could not be deleted from vidx2idx map\n",
> > +			__func__, vidx);
> > +		return -ESRCH;
> > +	}
> > +
> > +	old_midx = xa_to_value(map);
> > +
> > +	rc = xa_insert(&npc_priv.xa_vidx2idx_map, vidx,
> > +		       xa_mk_value(new_midx), GFP_KERNEL);
> > +	if (rc) {
> > +		dev_err(rvu->dev,
> > +			"%s: vidx(%u) cannot be added to vidx2idx map\n",
> > +			__func__, vidx);
> > +		return rc;
> > +	}
> > +
> > +	map = xa_erase(&npc_priv.xa_idx2vidx_map, old_midx);
> > +	if (!map) {
> > +		dev_err(rvu->dev,
> > +			"%s: old_midx(%u, vidx(%u)) cannot be added to idx2vidx map\n",
> > +			__func__, old_midx, vidx);
> > +		return -ESRCH;
>
> If this error path is taken, xa_vidx2idx_map now has vidx pointing to
> new_midx, but xa_idx2vidx_map still has old_midx pointing back to vidx.
> The bidirectional mapping becomes inconsistent.  Can this leave the maps
> in a corrupted state?
ACK.

>
> > +	}
> > +
> > +	rc = xa_insert(&npc_priv.xa_idx2vidx_map, new_midx,
> > +		       xa_mk_value(vidx), GFP_KERNEL);
> > +	if (rc) {
> > +		dev_err(rvu->dev,
> > +			"%s: new_midx(%u, vidx(%u)) cannot be added to idx2vidx map\n",
> > +			__func__, new_midx, vidx);
> > +		return rc;
>
> Similarly here, if xa_insert fails (e.g., memory pressure), the maps are
> left inconsistent: xa_vidx2idx_map has vidx -> new_midx but xa_idx2vidx_map
> has neither old_midx nor new_midx pointing to vidx.  Subsequent lookups
> via npc_idx2vidx() for new_midx would fail to find the correct vidx.
> Should these error paths restore the previous map state?
ACK.

>
> > +	}
> > +
> > +	return 0;
> > +}
>
> [ ... ]
>
> > +static int npc_defrag_process(struct rvu *rvu, struct list_head *lh)
> > +{
>
> [ ... ]
>
> > +err:
> > +	/* TODO: how to go back to old state ? */
> > +	return rc;
> > +}
>
> The TODO comment here suggests awareness that error recovery is incomplete.
> Given that npc_defrag_move_vdx_to_free() calls npc_vidx_maps_modify() which
> can fail mid-operation, should the defragmentation be made atomic or have
> proper rollback?
Whole defragmentation process is done under lock. We cannot roll back as roll back
can also fail.

The defragmentation process failure indicate a bug in index management code.

I removed the "TODO" comment and added an explantion.