[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250308203835.60633-2-enjuk@amazon.com>
Date: Sun, 9 Mar 2025 05:37:18 +0900
From: Kohei Enju <enjuk@...zon.com>
To: <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<bpf@...r.kernel.org>
CC: "David S . Miller" <davem@...emloft.net>, Eric Dumazet
<edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
<pabeni@...hat.com>, Simon Horman <horms@...nel.org>, Kuniyuki Iwashima
<kuniyu@...zon.com>, Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
"Ahmed Zaki" <ahmed.zaki@...el.com>, Stanislav Fomichev <sdf@...ichev.me>,
"Alexander Lobakin" <aleksander.lobakin@...el.com>, Kohei Enju
<kohei.enju@...il.com>, Kohei Enju <enjuk@...zon.com>
Subject: [PATCH net-next v1] dev: remove netdev_lock() and netdev_lock_ops() in register_netdevice().
ieee80211_register_hw() takes the wiphy lock, and then
register_netdevice() calls netdev_lock(), since commit 5fda3f35349b ("net:
make netdev_lock() protect netdev->reg_state").
On the other hand, do_setlink() calls netdev_lock() and then
ieee80211_change_mac() takes the wiphy lock, after commit df43d8bf1031
("net: replace dev_addr_sem with netdev instance lock").
This causes the circular locking dependency.
Like netdev_lock(), netdev_lock_ops() introduced in commit 97246d6d21c2
("net: hold netdev instance lock during ndo_bpf") would also cause the
same type of issue.
Both netdev_lock() and netdev_lock_ops() are called before
list_netdevice() in register_netdevice().
No other context can access the struct net_device, so we don't need these
locks in this context.
Remove them.
WARNING: possible circular locking dependency detected
6.14.0-rc5-next-20250307 #44 Not tainted
NetworkManager/8289 is trying to acquire lock:
ffff88810fb30768 (&rdev->wiphy.mtx){+.+.}-{4:4}, at: ieee80211_change_mac+0x110/0x1450
but task is already holding lock:
ffff88810fb48d30 (&dev->lock){+.+.}-{4:4}, at: do_setlink.isra.0+0x326c/0x3f40
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&dev->lock){+.+.}-{4:4}:
lock_acquire+0x1b2/0x550
__mutex_lock+0x1a2/0x14b0
register_netdevice+0x12dc/0x2000
cfg80211_register_netdevice+0x149/0x330
ieee80211_if_add+0xcfe/0x1880
ieee80211_register_hw+0x3655/0x3f60
mac80211_hwsim_new_radio+0x2760/0x53a0
init_mac80211_hwsim+0x6c6/0x7d0
do_one_initcall+0x11a/0x6d0
kernel_init_freeable+0x6dd/0x770
kernel_init+0x1f/0x1e0
ret_from_fork+0x45/0x80
ret_from_fork_asm+0x1a/0x30
-> #0 (&rdev->wiphy.mtx){+.+.}-{4:4}:
check_prev_add+0x1af/0x23e0
__lock_acquire+0x2d55/0x49a0
lock_acquire+0x1b2/0x550
__mutex_lock+0x1a2/0x14b0
ieee80211_change_mac+0x110/0x1450
netif_set_mac_address+0x30a/0x4a0
do_setlink.isra.0+0x77a/0x3f40
rtnl_newlink+0x11ef/0x2370
rtnetlink_rcv_msg+0x95b/0xe90
netlink_rcv_skb+0x16b/0x440
netlink_unicast+0x532/0x7f0
netlink_sendmsg+0x8ca/0xda0
____sys_sendmsg+0x9cf/0xb60
___sys_sendmsg+0x11a/0x1a0
__sys_sendmsg+0x136/0x1e0
do_syscall_64+0x74/0x190
entry_SYSCALL_64_after_hwframe+0x76/0x7e
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&dev->lock);
lock(&rdev->wiphy.mtx);
lock(&dev->lock);
lock(&rdev->wiphy.mtx);
*** DEADLOCK ***
Fixes: df43d8bf1031 ("net: replace dev_addr_sem with netdev instance lock")
Signed-off-by: Kohei Enju <enjuk@...zon.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@...zon.com>
---
net/core/dev.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index c4cc33f73629..df9661961558 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11003,17 +11003,11 @@ int register_netdevice(struct net_device *dev)
goto err_ifindex_release;
ret = netdev_register_kobject(dev);
-
- netdev_lock(dev);
WRITE_ONCE(dev->reg_state, ret ? NETREG_UNREGISTERED : NETREG_REGISTERED);
- netdev_unlock(dev);
-
if (ret)
goto err_uninit_notify;
- netdev_lock_ops(dev);
__netdev_update_features(dev);
- netdev_unlock_ops(dev);
/*
* Default initial state at registry is that the
--
2.48.1
Powered by blists - more mailing lists