Message-ID: <aA7hwMhd3kyKpvUu@fedora>
Date: Mon, 28 Apr 2025 02:02:40 +0000
From: Hangbin Liu <liuhangbin@...il.com>
To: Wang Liang <wangliang74@...wei.com>
Cc: Stanislav Fomichev <sdf@...ichev.me>, netdev@...r.kernel.org,
davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, jv@...sburgh.net, andrew+netdev@...n.ch,
linux-kernel@...r.kernel.org,
syzbot+48c14f61594bdfadb086@...kaller.appspotmail.com
Subject: Re: [PATCH net v2] bonding: hold ops lock around get_link
On Sun, Apr 27, 2025 at 11:06:32AM +0800, Wang Liang wrote:
>
> On 2025/4/11 0:11, Stanislav Fomichev wrote:
> > syzbot reports a case of ethtool_ops->get_link being called without
> > ops lock:
> >
> > ethtool_op_get_link+0x15/0x60 net/ethtool/ioctl.c:63
> > bond_check_dev_link+0x1fb/0x4b0 drivers/net/bonding/bond_main.c:864
> > bond_miimon_inspect drivers/net/bonding/bond_main.c:2734 [inline]
> > bond_mii_monitor+0x49d/0x3170 drivers/net/bonding/bond_main.c:2956
> > process_one_work kernel/workqueue.c:3238 [inline]
> > process_scheduled_works+0xac3/0x18e0 kernel/workqueue.c:3319
> > worker_thread+0x870/0xd50 kernel/workqueue.c:3400
> > kthread+0x7b7/0x940 kernel/kthread.c:464
> > ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:153
> > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> >
> > Commit 04efcee6ef8d ("net: hold instance lock during NETDEV_CHANGE")
> > changed to lockless __linkwatch_sync_dev in ethtool_op_get_link.
> > All paths except bonding are coming via locked ioctl. Add necessary
> > locking to bonding.
> >
> > Reviewed-by: Hangbin Liu <liuhangbin@...il.com>
> > Reported-by: syzbot+48c14f61594bdfadb086@...kaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=48c14f61594bdfadb086
> > Fixes: 04efcee6ef8d ("net: hold instance lock during NETDEV_CHANGE")
> > Signed-off-by: Stanislav Fomichev <sdf@...ichev.me>
> > ---
> > v2:
> > - move 'BMSR_LSTATUS : 0' part out (Jakub)
> > ---
> > drivers/net/bonding/bond_main.c | 13 +++++++++----
> > 1 file changed, 9 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> > index 950d8e4d86f8..8ea183da8d53 100644
> > --- a/drivers/net/bonding/bond_main.c
> > +++ b/drivers/net/bonding/bond_main.c
> > @@ -850,8 +850,9 @@ static int bond_check_dev_link(struct bonding *bond,
> >  			       struct net_device *slave_dev, int reporting)
> >  {
> >  	const struct net_device_ops *slave_ops = slave_dev->netdev_ops;
> > -	struct ifreq ifr;
> >  	struct mii_ioctl_data *mii;
> > +	struct ifreq ifr;
> > +	int ret;
> >
> >  	if (!reporting && !netif_running(slave_dev))
> >  		return 0;
> > @@ -860,9 +861,13 @@ static int bond_check_dev_link(struct bonding *bond,
> >  		return netif_carrier_ok(slave_dev) ? BMSR_LSTATUS : 0;
> >
> >  	/* Try to get link status using Ethtool first. */
> > -	if (slave_dev->ethtool_ops->get_link)
> > -		return slave_dev->ethtool_ops->get_link(slave_dev) ?
> > -			BMSR_LSTATUS : 0;
> > +	if (slave_dev->ethtool_ops->get_link) {
> > +		netdev_lock_ops(slave_dev);
> > +		ret = slave_dev->ethtool_ops->get_link(slave_dev);
> > +		netdev_unlock_ops(slave_dev);
> > +
> > +		return ret ? BMSR_LSTATUS : 0;
> > +	}
> >
> >  	/* Ethtool can't be used, fallback to MII ioctls. */
> >  	if (slave_ops->ndo_eth_ioctl) {
>
>
> Hello, I found that a WARNING still triggers with this patch applied:
>
> RTNL: assertion failed at ./include/net/netdev_lock.h (56)
> WARNING: CPU: 1 PID: 3020 at ./include/net/netdev_lock.h:56
> netdev_ops_assert_locked include/net/netdev_lock.h:56 [inline]
> WARNING: CPU: 1 PID: 3020 at ./include/net/netdev_lock.h:56
> __linkwatch_sync_dev+0x30d/0x360 net/core/link_watch.c:279
> Modules linked in:
> CPU: 1 UID: 0 PID: 3020 Comm: kworker/u8:10 Not tainted
> 6.15.0-rc2-syzkaller-00257-gb5c6891b2c5b #0 PREEMPT(full)
> Hardware name: Google Compute Engine, BIOS Google 02/12/2025
> Workqueue: bond0 bond_mii_monitor
> RIP: 0010:netdev_ops_assert_locked include/net/netdev_lock.h:56 [inline]
>
> It is reported by syzbot (link:
> https://syzkaller.appspot.com/bug?extid=48c14f61594bdfadb086).
>
> It triggers because ASSERT_RTNL() fails in netdev_ops_assert_locked().
>
> I wonder whether we should also take the rtnl lock in bond_check_dev_link()?
>
> Like this:
>
> +++ b/drivers/net/bonding/bond_main.c
> @@ -862,10 +862,12 @@ static int bond_check_dev_link(struct bonding
> *bond,
>
>  	/* Try to get link status using Ethtool first. */
>  	if (slave_dev->ethtool_ops->get_link) {
> -		netdev_lock_ops(slave_dev);
> -		ret = slave_dev->ethtool_ops->get_link(slave_dev);
> -		netdev_unlock_ops(slave_dev);
> -
> +		if (rtnl_trylock()) {
> +			netdev_lock_ops(slave_dev);
> +			ret = slave_dev->ethtool_ops->get_link(slave_dev);
> +			netdev_unlock_ops(slave_dev);
> +			rtnl_unlock();
> +		}
>  		return ret ? BMSR_LSTATUS : 0;
>  	}
>
What if rtnl_trylock() fails? Then ret is returned without ever having
been set.
Maybe move the trylock into the condition instead:
	if (slave_dev->ethtool_ops->get_link && rtnl_trylock()) {
		netdev_lock_ops(slave_dev);
		ret = slave_dev->ethtool_ops->get_link(slave_dev);
		netdev_unlock_ops(slave_dev);
		rtnl_unlock();
		return ret ? BMSR_LSTATUS : 0;
	}
Thanks
Hangbin
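
For readers following along outside the kernel tree, the control flow
suggested above can be sketched in userspace C. This is only an analogy
under stated assumptions, not the bonding driver code: fake_rtnl and
fake_ops_lock are hypothetical pthread mutexes standing in for RTNL and
the per-device ops lock, and struct fake_dev, fake_get_link() and
fake_mii_fallback() are invented for illustration. The point it tries to
show is that, with the trylock folded into the condition, ret is only read
on the path where it was assigned, and a contended lock falls through to
the next method (in the quoted diff context, the MII ioctl path).

```c
#include <pthread.h>
#include <stdbool.h>

#define BMSR_LSTATUS 0x0004	/* link status bit, same value as <linux/mii.h> */

/* Hypothetical stand-ins for RTNL and the per-device ops lock. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t fake_ops_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical device: only the fields needed for the sketch. */
struct fake_dev {
	bool carrier_ok;	/* what the "ethtool" path would report */
	bool mii_link;		/* what the fallback path would report */
	bool (*get_link)(struct fake_dev *dev);
};

static bool fake_get_link(struct fake_dev *dev)
{
	return dev->carrier_ok;
}

/* Stand-in for the later link-check method used when ethtool is skipped. */
static int fake_mii_fallback(struct fake_dev *dev)
{
	return dev->mii_link ? BMSR_LSTATUS : 0;
}

static int check_dev_link(struct fake_dev *dev)
{
	int ret;

	/*
	 * Enter the ethtool path only when the outer lock was actually
	 * taken, so ret is always assigned before it is read.
	 */
	if (dev->get_link && pthread_mutex_trylock(&fake_rtnl) == 0) {
		pthread_mutex_lock(&fake_ops_lock);
		ret = dev->get_link(dev);
		pthread_mutex_unlock(&fake_ops_lock);
		pthread_mutex_unlock(&fake_rtnl);
		return ret ? BMSR_LSTATUS : 0;
	}

	/* No get_link, or the outer lock is contended: use the fallback. */
	return fake_mii_fallback(dev);
}
```

One trade-off visible in the sketch: whenever the outer lock is contended,
the result comes from the fallback method rather than from get_link, so
the two sources need to agree for the caller to see a stable link state.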