Message-ID: <20260127030556.3839208-1-kuba@kernel.org>
Date: Mon, 26 Jan 2026 19:05:56 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: haokexin@...il.com
Cc: Jakub Kicinski <kuba@...nel.org>,
	kuniyu@...gle.com,
	andrew+netdev@...n.ch,
	vladimir.oltean@....com,
	s-vadapalli@...com,
	linux-omap@...r.kernel.org,
	rogerq@...nel.org,
	stable@...r.kernel.org,
	pabeni@...hat.com,
	netdev@...r.kernel.org,
	edumazet@...gle.com,
	davem@...emloft.net
Subject: Re: [net,v2] net: cpsw_new: Execute ndo_set_rx_mode callback in a work queue

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net: cpsw_new: Execute ndo_set_rx_mode callback in a work queue

This patch moves the rx_mode processing from ndo_set_rx_mode into a
workqueue to address RTNL assertion warnings when vlan_for_each() is
called without RTNL held.

> diff --git a/drivers/net/ethernet/ti/cpsw_new.c b/drivers/net/ethernet/ti/cpsw_new.c
> index ab88d4c02cbd..a631df9691e0 100644
> --- a/drivers/net/ethernet/ti/cpsw_new.c
> +++ b/drivers/net/ethernet/ti/cpsw_new.c
> @@ -248,15 +248,23 @@ static int cpsw_purge_all_mc(struct net_device *ndev, const u8 *addr, int num)
>  	return 0;
>  }
>
> -static void cpsw_ndo_set_rx_mode(struct net_device *ndev)
> +static void cpsw_ndo_set_rx_mode_work(struct work_struct *work)
>  {
> -	struct cpsw_priv *priv = netdev_priv(ndev);
> +	struct cpsw_priv *priv = container_of(work, struct cpsw_priv, rx_mode_work);
>  	struct cpsw_common *cpsw = priv->cpsw;
> +	struct net_device *ndev = priv->ndev;
>
> +	if (!netif_running(ndev))
> +		return;
> +
> +	rtnl_lock();

Can this deadlock during device close? The call path would be:

  Thread A (holding RTNL): __dev_close_many() -> cpsw_ndo_stop() ->
      cancel_work_sync(&priv->rx_mode_work)

  Work thread: cpsw_ndo_set_rx_mode_work() -> rtnl_lock() [blocked]

Thread A holds RTNL and waits in cancel_work_sync() for the work to
complete, while the work thread blocks on rtnl_lock() waiting for
Thread A.

The netif_running() check happens before rtnl_lock() is acquired, so
there is a race window where the work passes the check, then
__dev_close_many() clears __LINK_STATE_START and calls ndo_stop().

The commit message mentions following the icssg-prueth driver approach,
but that driver uses rtnl_trylock() instead of rtnl_lock() in its
emac_ndo_set_rx_mode_work() function, which avoids this deadlock by
returning early if the lock cannot be acquired.
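The difference between the two locking choices can be modelled in userspace. The sketch below is not driver code: a pthread mutex stands in for RTNL, pthread_join() stands in for cancel_work_sync(), and every name in it (model_rtnl, model_running, model_rx_mode_work, run_close_race) is hypothetical. It forces the ordering described above (the worker passes the running check, then the closer clears the flag and waits) and shows that the try-lock variant returns instead of blocking.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins: a mutex for RTNL, a flag for netif_running(). */
static pthread_mutex_t model_rtnl = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool model_running;
static atomic_bool passed_check;
static atomic_int applied;	/* updates performed under the lock */
static atomic_int skipped;	/* updates dropped because the lock was busy */

/* Models the work function, using a try-lock instead of a blocking lock. */
static void *model_rx_mode_work(void *arg)
{
	(void)arg;

	if (!atomic_load(&model_running))
		return NULL;
	atomic_store(&passed_check, true);	/* now inside the race window */

	/* Force the bad ordering: wait until the closer has cleared the flag. */
	while (atomic_load(&model_running))
		sched_yield();

	/* A blocking pthread_mutex_lock() here would never return. */
	if (pthread_mutex_trylock(&model_rtnl) != 0) {
		atomic_fetch_add(&skipped, 1);
		return NULL;
	}

	atomic_fetch_add(&applied, 1);
	pthread_mutex_unlock(&model_rtnl);
	return NULL;
}

/* Models the close path: hold the lock, then wait for the worker. */
int run_close_race(void)
{
	pthread_t worker;

	atomic_store(&model_running, true);
	atomic_store(&passed_check, false);
	atomic_store(&applied, 0);
	atomic_store(&skipped, 0);

	pthread_mutex_lock(&model_rtnl);
	if (pthread_create(&worker, NULL, model_rx_mode_work, NULL) != 0) {
		pthread_mutex_unlock(&model_rtnl);
		return -1;
	}

	/* Let the worker pass its running check before "closing". */
	while (!atomic_load(&passed_check))
		sched_yield();
	atomic_store(&model_running, false);

	pthread_join(worker, NULL);	/* stand-in for cancel_work_sync() */
	pthread_mutex_unlock(&model_rtnl);

	return atomic_load(&skipped);
}
```

Built with -pthread and called from a main(), run_close_race() returns once the worker has given up on the lock. Note that in this model the skipped update is simply dropped; whether that is acceptable for the driver, or whether the work needs to be rescheduled, is a separate question.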

> +	netif_addr_lock_bh(ndev);

[ ... ]

> @@ -813,6 +831,8 @@ static int cpsw_ndo_stop(struct net_device *ndev)
>
>  	__hw_addr_ref_unsync_dev(&ndev->mc, ndev, cpsw_purge_all_mc);
>
> +	cancel_work_sync(&priv->rx_mode_work);

cpsw_ndo_stop() is called with RTNL held (from __dev_close_many(), which
has ASSERT_RTNL()). If the work function is blocked on rtnl_lock() at
this point, cancel_work_sync() will wait forever, causing a deadlock.

[ ... ]
-- 
pw-bot: cr