Date: Thu, 28 Mar 2024 23:37:01 +0100
From: Johannes Berg <johannes@...solutions.net>
To: syzbot <syzbot+7526b1c2ce0b9a92e9a6@...kaller.appspotmail.com>, 
	davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org, 
	linux-kernel@...r.kernel.org, linux-wireless@...r.kernel.org, 
	netdev@...r.kernel.org, pabeni@...hat.com, syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [wireless?] possible deadlock in ieee80211_open

On Wed, 2024-03-27 at 07:52 -0700, syzbot wrote:
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.8.0-syzkaller-05204-g237bb5f7f7f5 #0 Not tainted
> ------------------------------------------------------
> syz-executor.0/7478 is trying to acquire lock:
> ffff888077110768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock include/net/cfg80211.h:5951 [inline]
> ffff888077110768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
> 
> but task is already holding lock:
> ffff888064974d20 (team->team_lock_key#17){+.+.}-{3:3}, at: team_add_slave+0xad/0x2750 drivers/net/team/team.c:1973
> 
> which lock already depends on the new lock.

Hmm.

> the existing dependency chain (in reverse order) is:
> 
> -> #1 (team->team_lock_key#17){+.+.}-{3:3}:
>        lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
>        __mutex_lock_common kernel/locking/mutex.c:608 [inline]
>        __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
>        team_port_change_check+0x51/0x1e0 drivers/net/team/team.c:2995
>        team_device_event+0x161/0x5b0 drivers/net/team/team.c:3021
>        notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
>        call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
>        call_netdevice_notifiers net/core/dev.c:2002 [inline]
>        dev_close_many+0x33c/0x4c0 net/core/dev.c:1543
>        unregister_netdevice_many_notify+0x544/0x16d0 net/core/dev.c:11071
>        macvlan_device_event+0x7bc/0x850 drivers/net/macvlan.c:1828
>        notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
>        call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
>        call_netdevice_notifiers net/core/dev.c:2002 [inline]
>        unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
>        unregister_netdevice_many net/core/dev.c:11154 [inline]
>        unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
>        unregister_netdevice include/linux/netdevice.h:3115 [inline]
>        _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
>        ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242
>        ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
>        rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
>        cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847

So this was the interface being removed via nl80211 (why do we even do
that? rtnetlink can do that too ...)

I guess it was a team port, since team_port_get_rtnl() must've been non-
NULL for this netdev. That acquires the team->lock mutex, but we hold
the wiphy mutex around unregister_netdevice().

> -> #0 (&rdev->wiphy.mtx){+.+.}-{3:3}:
>        check_prev_add kernel/locking/lockdep.c:3134 [inline]
>        check_prevs_add kernel/locking/lockdep.c:3253 [inline]
>        validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
>        __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
>        lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
>        __mutex_lock_common kernel/locking/mutex.c:608 [inline]
>        __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
>        wiphy_lock include/net/cfg80211.h:5951 [inline]
>        ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
>        __dev_open+0x2d3/0x450 net/core/dev.c:1430
>        dev_open+0xae/0x1b0 net/core/dev.c:1466
>        team_port_add drivers/net/team/team.c:1214 [inline]
>        team_add_slave+0x9b3/0x2750 drivers/net/team/team.c:1974
>        do_set_master net/core/rtnetlink.c:2685 [inline]
>        do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2891
>        __rtnl_newlink net/core/rtnetlink.c:3680 [inline]
>        rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3727
>        rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6595

I guess this was actually adding it as a team slave/port, which acquired
the team->lock mutex, but ieee80211_open() acquires the wiphy lock.

We _don't_ hold the wiphy mutex around dev_close() when invoked in this
path (see nl80211_del_interface), but regardless of how we delete the
interface, we will hold wiphy mutex around the unregister.
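
(Illustration, not part of the kernel: a minimal userspace sketch of why the two chains above trip the checker. It records "held, then taken" edges for each path and looks for a cycle. The lock names are just labels copied from the report, and `record_path` and `has_cycle` are hypothetical helpers; lockdep's real implementation is considerably more involved.)

```python
from collections import defaultdict


def record_path(graph, locks_in_order):
    """Add a 'held -> taken' edge for each consecutive pair of locks on one path."""
    for held, taken in zip(locks_in_order, locks_in_order[1:]):
        graph[held].add(taken)


def has_cycle(graph):
    """Depth-first search for a cycle in the directed lock-order graph."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a lock that is still on the current DFS path
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))


graph = defaultdict(set)
# Chain #1: interface removal holds the wiphy mutex, then the team lock is taken.
record_path(graph, ["wiphy.mtx", "team->lock"])
# Chain #0: adding a team port holds the team lock, then the wiphy mutex is taken.
record_path(graph, ["team->lock", "wiphy.mtx"])

inversion = has_cycle(graph)
```

With only one of the two paths recorded there is no cycle; it takes both orders together to produce the report.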

Thing is, I'm not sure I see a good way to avoid that? Maybe we could
defer the unregister, and just set the ieee80211_ptr to NULL to make it
effectively dead for wireless in the meantime. Not sure.

However, as far as I can tell it's not actually possible for the
deadlock to happen, because _both_ paths will necessarily be holding the
RTNL around them - from nl80211 (nl80211_del_interface has
NL80211_FLAG_NEED_RTNL) and rtnetlink_rcv_msg() respectively.

So ultimately, we're both holding the mutex for internal reasons, but
given the outer RTNL, I don't see how this would really deadlock.
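
(Illustration, not part of the kernel: a small userspace model of the argument above, under the assumption that both paths always run with the outer lock held. `rtnl`, `wiphy` and `team` are plain Python locks standing in for the RTNL, the wiphy mutex and the team lock, and the two functions are hypothetical stand-ins for the nl80211 and rtnetlink paths. The inner locks are taken in opposite orders, but the outer lock serializes the two paths, so neither thread can hold one inner lock while waiting for the other.)

```python
import threading

rtnl = threading.Lock()   # stands in for the RTNL
wiphy = threading.Lock()  # stands in for the wiphy mutex
team = threading.Lock()   # stands in for the team lock
completed = []


def del_interface():
    # Models the nl80211 removal path: RTNL, then wiphy mutex, then team lock.
    with rtnl:
        with wiphy:
            with team:
                completed.append("del_interface")


def add_port():
    # Models the rtnetlink path: RTNL, then team lock, then wiphy mutex.
    with rtnl:
        with team:
            with wiphy:
                completed.append("add_port")


def run(rounds):
    """Race the two paths repeatedly; return False if any thread gets stuck."""
    for _ in range(rounds):
        threads = [
            threading.Thread(target=del_interface, daemon=True),
            threading.Thread(target=add_port, daemon=True),
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join(timeout=5)
        if any(t.is_alive() for t in threads):
            return False
    return True


all_finished = run(200)
```

Dropping the `with rtnl:` lines from both functions would make the same race able to hang, which is the case lockdep is warning about without knowing that the outer lock rules it out.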

Given that, I'm inclined to ignore this, although it'd be nice to
silence lockdep about it somehow I guess?

johannes
