Message-ID: <20260127030556.3839208-1-kuba@kernel.org>
Date: Mon, 26 Jan 2026 19:05:56 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: haokexin@...il.com
Cc: Jakub Kicinski <kuba@...nel.org>,
kuniyu@...gle.com,
andrew+netdev@...n.ch,
vladimir.oltean@....com,
s-vadapalli@...com,
linux-omap@...r.kernel.org,
rogerq@...nel.org,
stable@...r.kernel.org,
pabeni@...hat.com,
netdev@...r.kernel.org,
edumazet@...gle.com,
davem@...emloft.net
Subject: Re: [net,v2] net: cpsw_new: Execute ndo_set_rx_mode callback in a work queue
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net: cpsw_new: Execute ndo_set_rx_mode callback in a work queue
This patch moves the rx_mode processing from ndo_set_rx_mode into a
workqueue to address RTNL assertion warnings when vlan_for_each() is
called without RTNL held.
> diff --git a/drivers/net/ethernet/ti/cpsw_new.c b/drivers/net/ethernet/ti/cpsw_new.c
> index ab88d4c02cbd..a631df9691e0 100644
> --- a/drivers/net/ethernet/ti/cpsw_new.c
> +++ b/drivers/net/ethernet/ti/cpsw_new.c
> @@ -248,15 +248,23 @@ static int cpsw_purge_all_mc(struct net_device *ndev, const u8 *addr, int num)
>  	return 0;
>  }
>
> -static void cpsw_ndo_set_rx_mode(struct net_device *ndev)
> +static void cpsw_ndo_set_rx_mode_work(struct work_struct *work)
>  {
> -	struct cpsw_priv *priv = netdev_priv(ndev);
> +	struct cpsw_priv *priv = container_of(work, struct cpsw_priv, rx_mode_work);
>  	struct cpsw_common *cpsw = priv->cpsw;
> +	struct net_device *ndev = priv->ndev;
>
> +	if (!netif_running(ndev))
> +		return;
> +
> +	rtnl_lock();

Can this deadlock during device close? The call path would be:

  Thread A (holding RTNL): __dev_close_many() -> cpsw_ndo_stop() ->
                           cancel_work_sync(&priv->rx_mode_work)
  Work thread: cpsw_ndo_set_rx_mode_work() -> rtnl_lock() [blocked]

Thread A holds RTNL and waits in cancel_work_sync() for the work to
complete, while the work thread blocks on rtnl_lock() waiting for
Thread A.

The netif_running() check happens before rtnl_lock() is acquired, so
there is a race window where the work passes the check, then
__dev_close_many() clears __LINK_STATE_START and calls ndo_stop().

The commit message mentions following the icssg-prueth driver approach,
but that driver uses rtnl_trylock() instead of rtnl_lock() in its
emac_ndo_set_rx_mode_work() function, which avoids this deadlock by
returning early if the lock cannot be acquired.
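
For illustration only, the try-lock idea can be sketched in userspace.
This is not kernel code and not a proposed fix. It assumes a pthread
mutex (fake_rtnl, a hypothetical name) standing in for RTNL, and
pthread_join() standing in for cancel_work_sync(). The function names
rx_mode_work() and stop_device_holding_lock() are likewise made up for
this sketch.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for RTNL; not the kernel's lock. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

/* Set only when the work item managed to take the lock and run. */
static bool work_applied;

/* Stand-in for the deferred rx_mode work item. */
static void *rx_mode_work(void *arg)
{
	(void)arg;

	/* Try-lock pattern: give up instead of blocking when the lock is held. */
	if (pthread_mutex_trylock(&fake_rtnl) != 0)
		return NULL;

	work_applied = true;	/* the rx_mode update would happen here */
	pthread_mutex_unlock(&fake_rtnl);
	return NULL;
}

/*
 * Stand-in for the close path: take the lock, then wait for the work
 * item to finish while still holding it. Returns true when the wait
 * completed and the work item backed off without touching any state.
 */
static bool stop_device_holding_lock(void)
{
	pthread_t worker;

	pthread_mutex_lock(&fake_rtnl);
	work_applied = false;

	if (pthread_create(&worker, NULL, rx_mode_work, NULL) != 0) {
		pthread_mutex_unlock(&fake_rtnl);
		return false;
	}

	/* Plays the role of cancel_work_sync(): wait for the work item. */
	pthread_join(worker, NULL);

	pthread_mutex_unlock(&fake_rtnl);
	return !work_applied;
}
```

With a blocking pthread_mutex_lock() in rx_mode_work(), the
pthread_join() above would never return, which mirrors the scenario
described here. The sketch says nothing about whether skipping the
update on a failed try-lock is acceptable for this driver.
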
> +	netif_addr_lock_bh(ndev);
[ ... ]
> @@ -813,6 +831,8 @@ static int cpsw_ndo_stop(struct net_device *ndev)
>
>  	__hw_addr_ref_unsync_dev(&ndev->mc, ndev, cpsw_purge_all_mc);
>
> +	cancel_work_sync(&priv->rx_mode_work);

This is called with RTNL held (from __dev_close_many(), which has
ASSERT_RTNL()). If the work function is blocked on rtnl_lock() at this
point, cancel_work_sync() will wait forever, causing a deadlock.
[ ... ]
--
pw-bot: cr