Message-ID: <aA7hwMhd3kyKpvUu@fedora>
Date: Mon, 28 Apr 2025 02:02:40 +0000
From: Hangbin Liu <liuhangbin@...il.com>
To: Wang Liang <wangliang74@...wei.com>
Cc: Stanislav Fomichev <sdf@...ichev.me>, netdev@...r.kernel.org,
	davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
	pabeni@...hat.com, jv@...sburgh.net, andrew+netdev@...n.ch,
	linux-kernel@...r.kernel.org,
	syzbot+48c14f61594bdfadb086@...kaller.appspotmail.com
Subject: Re: [PATCH net v2] bonding: hold ops lock around get_link

On Sun, Apr 27, 2025 at 11:06:32AM +0800, Wang Liang wrote:
> 
> On 2025/4/11 0:11, Stanislav Fomichev wrote:
> > syzbot reports a case of ethtool_ops->get_link being called without
> > ops lock:
> > 
> >   ethtool_op_get_link+0x15/0x60 net/ethtool/ioctl.c:63
> >   bond_check_dev_link+0x1fb/0x4b0 drivers/net/bonding/bond_main.c:864
> >   bond_miimon_inspect drivers/net/bonding/bond_main.c:2734 [inline]
> >   bond_mii_monitor+0x49d/0x3170 drivers/net/bonding/bond_main.c:2956
> >   process_one_work kernel/workqueue.c:3238 [inline]
> >   process_scheduled_works+0xac3/0x18e0 kernel/workqueue.c:3319
> >   worker_thread+0x870/0xd50 kernel/workqueue.c:3400
> >   kthread+0x7b7/0x940 kernel/kthread.c:464
> >   ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:153
> >   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> > 
> > Commit 04efcee6ef8d ("net: hold instance lock during NETDEV_CHANGE")
> > changed to lockless __linkwatch_sync_dev in ethtool_op_get_link.
> > All paths except bonding are coming via locked ioctl. Add necessary
> > locking to bonding.
> > 
> > Reviewed-by: Hangbin Liu <liuhangbin@...il.com>
> > Reported-by: syzbot+48c14f61594bdfadb086@...kaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=48c14f61594bdfadb086
> > Fixes: 04efcee6ef8d ("net: hold instance lock during NETDEV_CHANGE")
> > Signed-off-by: Stanislav Fomichev <sdf@...ichev.me>
> > ---
> > v2:
> > - move 'BMSR_LSTATUS : 0' part out (Jakub)
> > ---
> >   drivers/net/bonding/bond_main.c | 13 +++++++++----
> >   1 file changed, 9 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> > index 950d8e4d86f8..8ea183da8d53 100644
> > --- a/drivers/net/bonding/bond_main.c
> > +++ b/drivers/net/bonding/bond_main.c
> > @@ -850,8 +850,9 @@ static int bond_check_dev_link(struct bonding *bond,
> >   			       struct net_device *slave_dev, int reporting)
> >   {
> >   	const struct net_device_ops *slave_ops = slave_dev->netdev_ops;
> > -	struct ifreq ifr;
> >   	struct mii_ioctl_data *mii;
> > +	struct ifreq ifr;
> > +	int ret;
> >   	if (!reporting && !netif_running(slave_dev))
> >   		return 0;
> > @@ -860,9 +861,13 @@ static int bond_check_dev_link(struct bonding *bond,
> >   		return netif_carrier_ok(slave_dev) ? BMSR_LSTATUS : 0;
> >   	/* Try to get link status using Ethtool first. */
> > -	if (slave_dev->ethtool_ops->get_link)
> > -		return slave_dev->ethtool_ops->get_link(slave_dev) ?
> > -			BMSR_LSTATUS : 0;
> > +	if (slave_dev->ethtool_ops->get_link) {
> > +		netdev_lock_ops(slave_dev);
> > +		ret = slave_dev->ethtool_ops->get_link(slave_dev);
> > +		netdev_unlock_ops(slave_dev);
> > +
> > +		return ret ? BMSR_LSTATUS : 0;
> > +	}
> >   	/* Ethtool can't be used, fallback to MII ioctls. */
> >   	if (slave_ops->ndo_eth_ioctl) {
> 
> 
> Hello, I find that a WARNING still exists:
> 
>   RTNL: assertion failed at ./include/net/netdev_lock.h (56)
>   WARNING: CPU: 1 PID: 3020 at ./include/net/netdev_lock.h:56 netdev_ops_assert_locked include/net/netdev_lock.h:56 [inline]
>   WARNING: CPU: 1 PID: 3020 at ./include/net/netdev_lock.h:56 __linkwatch_sync_dev+0x30d/0x360 net/core/link_watch.c:279
>   Modules linked in:
>   CPU: 1 UID: 0 PID: 3020 Comm: kworker/u8:10 Not tainted 6.15.0-rc2-syzkaller-00257-gb5c6891b2c5b #0 PREEMPT(full)
>   Hardware name: Google Compute Engine, BIOS Google 02/12/2025
>   Workqueue: bond0 bond_mii_monitor
>   RIP: 0010:netdev_ops_assert_locked include/net/netdev_lock.h:56 [inline]
> 
> It is reported by syzbot (link:
> https://syzkaller.appspot.com/bug?extid=48c14f61594bdfadb086).
> 
> This is because ASSERT_RTNL() failed in netdev_ops_assert_locked().
> 
> I wonder if we should take the rtnl lock in bond_check_dev_link()?
> 
> Like this:
> 
>   +++ b/drivers/net/bonding/bond_main.c
>   @@ -862,10 +862,12 @@  static int bond_check_dev_link(struct bonding *bond,
> 
>        /* Try to get link status using Ethtool first. */
>        if (slave_dev->ethtool_ops->get_link) {
>   -        netdev_lock_ops(slave_dev);
>   -        ret = slave_dev->ethtool_ops->get_link(slave_dev);
>   -        netdev_unlock_ops(slave_dev);
>   -
>   +        if (rtnl_trylock()) {
>   +            netdev_lock_ops(slave_dev);
>   +            ret = slave_dev->ethtool_ops->get_link(slave_dev);
>   +            netdev_unlock_ops(slave_dev);
>   +            rtnl_unlock();
>   +        }
>            return ret ? BMSR_LSTATUS : 0;
>        }
> 

What if rtnl_trylock() fails? Then ret is returned without ever having
been set. Maybe:
	if (slave_dev->ethtool_ops->get_link && rtnl_trylock()) {
		netdev_lock_ops(slave_dev);
		ret = slave_dev->ethtool_ops->get_link(slave_dev);
		netdev_unlock_ops(slave_dev);
		rtnl_unlock();
		return ret ? BMSR_LSTATUS : 0;
	}
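
To make the control flow concrete, here is a minimal user-space sketch of
that fall-through, not kernel code. It assumes a pthread mutex as a
stand-in for RTNL; fake_rtnl, fake_ethtool_get_link(),
fake_mii_link_status(), last_path and check_link_sketch() are all
hypothetical names invented for illustration. Only the BMSR_LSTATUS value
is taken from linux/mii.h.

```c
#include <pthread.h>

#define BMSR_LSTATUS 0x0004	/* same value as in linux/mii.h */

/* Hypothetical user-space stand-in for the RTNL mutex. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-ins for the two link-status sources. */
static int fake_ethtool_get_link(void) { return 1; }
static int fake_mii_link_status(void) { return BMSR_LSTATUS; }

/* Records which path served the last call, for illustration only. */
enum link_path { PATH_NONE, PATH_ETHTOOL, PATH_MII };
static enum link_path last_path = PATH_NONE;

static int check_link_sketch(void)
{
	int ret;

	/* Use the ethtool path only if the lock is taken without blocking. */
	if (pthread_mutex_trylock(&fake_rtnl) == 0) {
		ret = fake_ethtool_get_link();
		pthread_mutex_unlock(&fake_rtnl);
		last_path = PATH_ETHTOOL;
		return ret ? BMSR_LSTATUS : 0;
	}

	/* Lock was busy: use the fallback, so ret is never read unset. */
	last_path = PATH_MII;
	return fake_mii_link_status();
}
```

In this sketch, a call made while fake_rtnl is free goes through the
ethtool stand-in, and a call made while it is held goes through the MII
stand-in, so ret is only read after it has been assigned.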

Thanks
Hangbin
