lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAO9qdTFQHmM_zvbtYqVLGTMrBaovkmC7AM0hsU26iacmx-bM=g@mail.gmail.com>
Date: Thu, 1 Aug 2024 20:04:42 +0900
From: Jeongjun Park <aha310510@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: jiri@...nulli.us, davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com, 
	nicolas.dichtel@...nd.com, liuhangbin@...il.com, netdev@...r.kernel.org, 
	linux-kernel@...r.kernel.org, 
	syzbot+44623300f057a28baf1e@...kaller.appspotmail.com, 
	syzbot+b668da2bc4cb9670bf58@...kaller.appspotmail.com
Subject: Re: [PATCH net,v2] rtnetlink: fix possible deadlock in team_port_change_check

Eric Dumazet wrote:
>
> On Wed, Jul 31, 2024 at 5:10 PM Jeongjun Park <aha310510@...il.com> wrote:
> >
> > In do_setlink() , do_set_master() is called when dev->flags does not have
> > the IFF_UP flag set, so 'team->lock' is acquired and dev_open() is called,
> > which generates the NETDEV_UP event. This causes a deadlock as it tries to
> > acquire 'team->lock' again.
> >
> > To solve this, we need to unlock 'team->lock' before calling dev_open()
> > in team_port_add() and then reacquire the lock when dev_open() returns.
> > Since the implementation acquires the lock in advance when the team
> > structure is used inside dev_open(), data races will not occur even if it
> > is briefly unlocked.
> >
> > ============================================
> > WARNING: possible recursive locking detected
> > 6.11.0-rc1-syzkaller-ge4fc196f5ba3-dirty #0 Not tainted
> > --------------------------------------------
> > syz.0.15/5889 is trying to acquire lock:
> > ffff8880231e4d40 (team->team_lock_key#2){+.+.}-{3:3}, at: team_port_change_check drivers/net/team/team_core.c:2950 [inline]
> > ffff8880231e4d40 (team->team_lock_key#2){+.+.}-{3:3}, at: team_device_event+0x2c7/0x770 drivers/net/team/team_core.c:2973
> >
> > but task is already holding lock:
> > ffff8880231e4d40 (team->team_lock_key#2){+.+.}-{3:3}, at: team_add_slave+0x9c/0x20e0 drivers/net/team/team_core.c:1975
> >
> > other info that might help us debug this:
> >  Possible unsafe locking scenario:
> >
> >        CPU0
> >        ----
> >   lock(team->team_lock_key#2);
> >   lock(team->team_lock_key#2);
> >
> >  *** DEADLOCK ***
> >
> >  May be due to missing lock nesting notation
> >
> > 2 locks held by syz.0.15/5889:
> >  #0: ffffffff8fa1f4e8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:79 [inline]
> >  #0: ffffffff8fa1f4e8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x372/0xea0 net/core/rtnetlink.c:6644
> >  #1: ffff8880231e4d40 (team->team_lock_key#2){+.+.}-{3:3}, at: team_add_slave+0x9c/0x20e0 drivers/net/team/team_core.c:1975
> >
> > stack backtrace:
> > CPU: 1 UID: 0 PID: 5889 Comm: syz.0.15 Not tainted 6.11.0-rc1-syzkaller-ge4fc196f5ba3-dirty #0
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> > Call Trace:
> >  <TASK>
> >  __dump_stack lib/dump_stack.c:93 [inline]
> >  dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:119
> >  check_deadlock kernel/locking/lockdep.c:3061 [inline]
> >  validate_chain kernel/locking/lockdep.c:3855 [inline]
> >  __lock_acquire+0x2167/0x3cb0 kernel/locking/lockdep.c:5142
> >  lock_acquire kernel/locking/lockdep.c:5759 [inline]
> >  lock_acquire+0x1b1/0x560 kernel/locking/lockdep.c:5724
> >  __mutex_lock_common kernel/locking/mutex.c:608 [inline]
> >  __mutex_lock+0x175/0x9c0 kernel/locking/mutex.c:752
> >  team_port_change_check drivers/net/team/team_core.c:2950 [inline]
> >  team_device_event+0x2c7/0x770 drivers/net/team/team_core.c:2973
> >  notifier_call_chain+0xb9/0x410 kernel/notifier.c:93
> >  call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1994
> >  call_netdevice_notifiers_extack net/core/dev.c:2032 [inline]
> >  call_netdevice_notifiers net/core/dev.c:2046 [inline]
> >  __dev_notify_flags+0x12d/0x2e0 net/core/dev.c:8876
> >  dev_change_flags+0x10c/0x160 net/core/dev.c:8914
> >  vlan_device_event+0xdfc/0x2120 net/8021q/vlan.c:468
> >  notifier_call_chain+0xb9/0x410 kernel/notifier.c:93
> >  call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1994
> >  call_netdevice_notifiers_extack net/core/dev.c:2032 [inline]
> >  call_netdevice_notifiers net/core/dev.c:2046 [inline]
> >  dev_open net/core/dev.c:1515 [inline]
> >  dev_open+0x144/0x160 net/core/dev.c:1503
> >  team_port_add drivers/net/team/team_core.c:1216 [inline]
> >  team_add_slave+0xacd/0x20e0 drivers/net/team/team_core.c:1976
> >  do_set_master+0x1bc/0x230 net/core/rtnetlink.c:2701
> >  do_setlink+0x306d/0x4060 net/core/rtnetlink.c:2907
> >  __rtnl_newlink+0xc35/0x1960 net/core/rtnetlink.c:3696
> >  rtnl_newlink+0x67/0xa0 net/core/rtnetlink.c:3743
> >  rtnetlink_rcv_msg+0x3c7/0xea0 net/core/rtnetlink.c:6647
> >  netlink_rcv_skb+0x16b/0x440 net/netlink/af_netlink.c:2550
> >  netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
> >  netlink_unicast+0x544/0x830 net/netlink/af_netlink.c:1357
> >  netlink_sendmsg+0x8b8/0xd70 net/netlink/af_netlink.c:1901
> >  sock_sendmsg_nosec net/socket.c:730 [inline]
> >  __sock_sendmsg net/socket.c:745 [inline]
> >  ____sys_sendmsg+0xab5/0xc90 net/socket.c:2597
> >  ___sys_sendmsg+0x135/0x1e0 net/socket.c:2651
> >  __sys_sendmsg+0x117/0x1f0 net/socket.c:2680
> >  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> >  do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
> >  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > RIP: 0033:0x7fc07ed77299
> > Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
> > RSP: 002b:00007fc07fb7f048 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> > RAX: ffffffffffffffda RBX: 00007fc07ef05f80 RCX: 00007fc07ed77299
> > RDX: 0000000000000000 RSI: 0000000020000600 RDI: 0000000000000012
> > RBP: 00007fc07ede48e6 R08: 0000000000000000 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> > R13: 000000000000000b R14: 00007fc07ef05f80 R15: 00007ffeb5c0d528
> >
> > Reported-by: syzbot+b668da2bc4cb9670bf58@...kaller.appspotmail.com
> > Fixes: ec4ffd100ffb ("Revert "net: rtnetlink: Enslave device before bringing it up"")
> > Signed-off-by: Jeongjun Park <aha310510@...il.com>
> > ---
> >  drivers/net/team/team_core.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
> > index ab1935a4aa2c..ee595c3c6624 100644
> > --- a/drivers/net/team/team_core.c
> > +++ b/drivers/net/team/team_core.c
> > @@ -1212,8 +1212,9 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
> >                            portname);
> >                 goto err_port_enter;
> >         }
> > -
> > +       mutex_unlock(&team->lock);
>
> Why would this be safe ?
>
> All checks done in team_port_add() before this point would need to be
> redone after mutex_lock() ?
>
> If another mutex (rtnl ?) is already protecting this path, this would
> suggest team->lock should be removed,
> and RTNL should be used in all needed paths.
>

If so, I will rewrite the patch and send by modifying
team_port_change_check.

Regards,
Jeongjun Park

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ