Message-ID: <aEt6LvBMwUMxmUyx@mini-arch>
Date: Thu, 12 Jun 2025 18:09:02 -0700
From: Stanislav Fomichev <stfomichev@...il.com>
To: syzbot <syzbot+b8c48ea38ca27d150063@...kaller.appspotmail.com>
Cc: davem@...emloft.net, edumazet@...gle.com, horms@...nel.org,
kuba@...nel.org, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org, pabeni@...hat.com,
syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [net?] WARNING in __linkwatch_sync_dev (2)
On 06/11, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: f09079bd04a9 Merge tag 'powerpc-6.16-2' of git://git.kerne..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=16e9260c580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=e24211089078d6c6
> dashboard link: https://syzkaller.appspot.com/bug?extid=b8c48ea38ca27d150063
> compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-f09079bd.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/ef68cb3d29a3/vmlinux-f09079bd.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/1cc9431b9a15/bzImage-f09079bd.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+b8c48ea38ca27d150063@...kaller.appspotmail.com
>
> ------------[ cut here ]------------
> RTNL: assertion failed at ./include/net/netdev_lock.h (72)
> WARNING: CPU: 0 PID: 1141 at ./include/net/netdev_lock.h:72 netdev_ops_assert_locked include/net/netdev_lock.h:72 [inline]
> WARNING: CPU: 0 PID: 1141 at ./include/net/netdev_lock.h:72 __linkwatch_sync_dev+0x1ed/0x230 net/core/link_watch.c:279
> ethtool_op_get_link+0x1d/0x70 net/ethtool/ioctl.c:63
> bond_check_dev_link+0x3f9/0x710 drivers/net/bonding/bond_main.c:863
> bond_miimon_inspect drivers/net/bonding/bond_main.c:2745 [inline]
> bond_mii_monitor+0x3c0/0x2dc0 drivers/net/bonding/bond_main.c:2967
> process_one_work+0x9cf/0x1b70 kernel/workqueue.c:3238
> process_scheduled_works kernel/workqueue.c:3321 [inline]
> worker_thread+0x6c8/0xf10 kernel/workqueue.c:3402
> kthread+0x3c5/0x780 kernel/kthread.c:464
> ret_from_fork+0x5d4/0x6f0 arch/x86/kernel/process.c:148
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> </TASK>

netdev_ops_assert_locked is called for a non-ops-locked netdev, so we
hit the ASSERT_RTNL case. That is a bit misleading, but I noticed that
bond_miimon_inspect runs under the rcu read lock, which is not
gonna work for ops-locked devices :-/ (we want to grab the instance
lock for the CHANGE notifiers).

I'm contemplating dropping rcu and doing rtnl_trylock instead. Looking at
commit f0c76d61779b ("bonding: refactor mii monitor"), it doesn't look
like there were issues with rtnl performance, so hopefully this should be ok.
I'm asking because I remember this trace from my recent patches:
[ 3456.656261] ? ipv6_add_dev+0x370/0x620
[ 3456.660039] ipv6_find_idev+0x96/0xe0
[ 3456.660445] addrconf_add_dev+0x1e/0xa0
[ 3456.660861] addrconf_init_auto_addrs+0xb0/0x720
[ 3456.661803] addrconf_notify+0x35f/0x8d0
[ 3456.662236] notifier_call_chain+0x38/0xf0
[ 3456.662676] netdev_state_change+0x65/0x90
[ 3456.663112] linkwatch_do_dev+0x5a/0x70
Here linkwatch_do_dev (potentially called from ethtool_op_get_link and
bond_check_dev_link) might trigger ipv6 address assignment, so I'm not
sure how this is all supposed to work under rcu and without the rtnl lock.

Tentatively (untested, uncompiled):
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index c4d53e8e7c15..e2c4bcdb8b1a 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2739,7 +2739,7 @@ static int bond_miimon_inspect(struct bonding *bond)
ignore_updelay = true;
}
- bond_for_each_slave_rcu(bond, slave, iter) {
+ bond_for_each_slave(bond, slave, iter) {
bond_propose_link_state(slave, BOND_LINK_NOCHANGE);
link_state = bond_check_dev_link(bond, slave->dev, 0);
@@ -2962,35 +2962,28 @@ static void bond_mii_monitor(struct work_struct *work)
if (!bond_has_slaves(bond))
goto re_arm;
- rcu_read_lock();
+ /* Race avoidance with bond_close cancel of workqueue */
+ if (!rtnl_trylock()) {
+ delay = 1;
+ should_notify_peers = false;
+ goto re_arm;
+ }
+
should_notify_peers = bond_should_notify_peers(bond);
commit = !!bond_miimon_inspect(bond);
if (bond->send_peer_notif) {
- rcu_read_unlock();
- if (rtnl_trylock()) {
- bond->send_peer_notif--;
- rtnl_unlock();
- }
- } else {
- rcu_read_unlock();
+ bond->send_peer_notif--;
}
if (commit) {
- /* Race avoidance with bond_close cancel of workqueue */
- if (!rtnl_trylock()) {
- delay = 1;
- should_notify_peers = false;
- goto re_arm;
- }
-
bond_for_each_slave(bond, slave, iter) {
bond_commit_link_state(slave, BOND_SLAVE_NOTIFY_LATER);
}
bond_miimon_commit(bond);
-
- rtnl_unlock(); /* might sleep, hold no other locks */
}
+ rtnl_unlock(); /* might sleep, hold no other locks */
+
re_arm:
if (bond->params.miimon)
queue_delayed_work(bond->wq, &bond->mii_work, delay);
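
For illustration only, here is a userspace sketch of the control flow the
patch above is going for: try to take the lock once at the top of the work
function, and if that fails, skip the pass and re-arm with a short delay
instead of blocking. This is not kernel code. fake_rtnl, struct monitor and
monitor_pass are hypothetical stand-ins (a pthread mutex in place of rtnl,
a returned delay in place of queue_delayed_work), and nothing here has been
checked against the bonding driver.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the rtnl mutex; not the kernel API. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical, heavily reduced stand-in for the monitor state. */
struct monitor {
	int normal_delay;    /* regular polling interval, in ticks */
	int send_peer_notif; /* pending peer notifications */
	int inspected;       /* number of passes that ran the inspection */
};

/*
 * One pass of the monitor work. Returns the delay to use when the
 * work is re-armed.
 */
static int monitor_pass(struct monitor *m)
{
	/* Lock is contended: do not block, retry on the next tick. */
	if (pthread_mutex_trylock(&fake_rtnl) != 0)
		return 1;

	/* Inspection and bookkeeping both run under the lock. */
	m->inspected++;
	if (m->send_peer_notif > 0)
		m->send_peer_notif--;

	pthread_mutex_unlock(&fake_rtnl);
	return m->normal_delay;
}
```

With the lock free, monitor_pass() does its work and returns the normal
interval. With the lock held elsewhere, it returns 1 and leaves the state
untouched, which mirrors the "delay = 1; goto re_arm;" path in the diff.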