linux-kernel - Re: [PATCH net,v2] rtnetlink: fix possible deadlock in team_port_change

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ZqucPsGsnwZ2QOXa@nanopsycho.orion>
Date: Thu, 1 Aug 2024 16:31:26 +0200
From: Jiri Pirko <jiri@...nulli.us>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Jeongjun Park <aha310510@...il.com>, davem@...emloft.net,
	kuba@...nel.org, pabeni@...hat.com, nicolas.dichtel@...nd.com,
	liuhangbin@...il.com, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	syzbot+44623300f057a28baf1e@...kaller.appspotmail.com,
	syzbot+b668da2bc4cb9670bf58@...kaller.appspotmail.com
Subject: Re: [PATCH net,v2] rtnetlink: fix possible deadlock in
 team_port_change_check

Thu, Aug 01, 2024 at 08:28:20AM CEST, edumazet@...gle.com wrote:
>On Wed, Jul 31, 2024 at 5:10 PM Jeongjun Park <aha310510@...il.com> wrote:
>>
>> In do_setlink() , do_set_master() is called when dev->flags does not have
>> the IFF_UP flag set, so 'team->lock' is acquired and dev_open() is called,
>> which generates the NETDEV_UP event. This causes a deadlock as it tries to
>> acquire 'team->lock' again.
>>
>> To solve this, we need to unlock 'team->lock' before calling dev_open()
>> in team_port_add() and then reacquire the lock when dev_open() returns.
>> Since the implementation acquires the lock in advance when the team
>> structure is used inside dev_open(), data races will not occur even if it
>> is briefly unlocked.
>>
>> ============================================
>> WARNING: possible recursive locking detected
>> 6.11.0-rc1-syzkaller-ge4fc196f5ba3-dirty #0 Not tainted
>> --------------------------------------------
>> syz.0.15/5889 is trying to acquire lock:
>> ffff8880231e4d40 (team->team_lock_key#2){+.+.}-{3:3}, at: team_port_change_check drivers/net/team/team_core.c:2950 [inline]
>> ffff8880231e4d40 (team->team_lock_key#2){+.+.}-{3:3}, at: team_device_event+0x2c7/0x770 drivers/net/team/team_core.c:2973
>>
>> but task is already holding lock:
>> ffff8880231e4d40 (team->team_lock_key#2){+.+.}-{3:3}, at: team_add_slave+0x9c/0x20e0 drivers/net/team/team_core.c:1975
>>
>> other info that might help us debug this:
>>  Possible unsafe locking scenario:
>>
>>        CPU0
>>        ----
>>   lock(team->team_lock_key#2);
>>   lock(team->team_lock_key#2);
>>
>>  *** DEADLOCK ***
>>
>>  May be due to missing lock nesting notation
>>
>> 2 locks held by syz.0.15/5889:
>>  #0: ffffffff8fa1f4e8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:79 [inline]
>>  #0: ffffffff8fa1f4e8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x372/0xea0 net/core/rtnetlink.c:6644
>>  #1: ffff8880231e4d40 (team->team_lock_key#2){+.+.}-{3:3}, at: team_add_slave+0x9c/0x20e0 drivers/net/team/team_core.c:1975
>>
>> stack backtrace:
>> CPU: 1 UID: 0 PID: 5889 Comm: syz.0.15 Not tainted 6.11.0-rc1-syzkaller-ge4fc196f5ba3-dirty #0
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
>> Call Trace:
>>  <TASK>
>>  __dump_stack lib/dump_stack.c:93 [inline]
>>  dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:119
>>  check_deadlock kernel/locking/lockdep.c:3061 [inline]
>>  validate_chain kernel/locking/lockdep.c:3855 [inline]
>>  __lock_acquire+0x2167/0x3cb0 kernel/locking/lockdep.c:5142
>>  lock_acquire kernel/locking/lockdep.c:5759 [inline]
>>  lock_acquire+0x1b1/0x560 kernel/locking/lockdep.c:5724
>>  __mutex_lock_common kernel/locking/mutex.c:608 [inline]
>>  __mutex_lock+0x175/0x9c0 kernel/locking/mutex.c:752
>>  team_port_change_check drivers/net/team/team_core.c:2950 [inline]
>>  team_device_event+0x2c7/0x770 drivers/net/team/team_core.c:2973
>>  notifier_call_chain+0xb9/0x410 kernel/notifier.c:93
>>  call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1994
>>  call_netdevice_notifiers_extack net/core/dev.c:2032 [inline]
>>  call_netdevice_notifiers net/core/dev.c:2046 [inline]
>>  __dev_notify_flags+0x12d/0x2e0 net/core/dev.c:8876
>>  dev_change_flags+0x10c/0x160 net/core/dev.c:8914
>>  vlan_device_event+0xdfc/0x2120 net/8021q/vlan.c:468
>>  notifier_call_chain+0xb9/0x410 kernel/notifier.c:93
>>  call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1994
>>  call_netdevice_notifiers_extack net/core/dev.c:2032 [inline]
>>  call_netdevice_notifiers net/core/dev.c:2046 [inline]
>>  dev_open net/core/dev.c:1515 [inline]
>>  dev_open+0x144/0x160 net/core/dev.c:1503
>>  team_port_add drivers/net/team/team_core.c:1216 [inline]
>>  team_add_slave+0xacd/0x20e0 drivers/net/team/team_core.c:1976
>>  do_set_master+0x1bc/0x230 net/core/rtnetlink.c:2701
>>  do_setlink+0x306d/0x4060 net/core/rtnetlink.c:2907
>>  __rtnl_newlink+0xc35/0x1960 net/core/rtnetlink.c:3696
>>  rtnl_newlink+0x67/0xa0 net/core/rtnetlink.c:3743
>>  rtnetlink_rcv_msg+0x3c7/0xea0 net/core/rtnetlink.c:6647
>>  netlink_rcv_skb+0x16b/0x440 net/netlink/af_netlink.c:2550
>>  netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
>>  netlink_unicast+0x544/0x830 net/netlink/af_netlink.c:1357
>>  netlink_sendmsg+0x8b8/0xd70 net/netlink/af_netlink.c:1901
>>  sock_sendmsg_nosec net/socket.c:730 [inline]
>>  __sock_sendmsg net/socket.c:745 [inline]
>>  ____sys_sendmsg+0xab5/0xc90 net/socket.c:2597
>>  ___sys_sendmsg+0x135/0x1e0 net/socket.c:2651
>>  __sys_sendmsg+0x117/0x1f0 net/socket.c:2680
>>  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>>  do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
>>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
>> RIP: 0033:0x7fc07ed77299
>> Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
>> RSP: 002b:00007fc07fb7f048 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
>> RAX: ffffffffffffffda RBX: 00007fc07ef05f80 RCX: 00007fc07ed77299
>> RDX: 0000000000000000 RSI: 0000000020000600 RDI: 0000000000000012
>> RBP: 00007fc07ede48e6 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
>> R13: 000000000000000b R14: 00007fc07ef05f80 R15: 00007ffeb5c0d528
>>
>> Reported-by: syzbot+b668da2bc4cb9670bf58@...kaller.appspotmail.com
>> Fixes: ec4ffd100ffb ("Revert "net: rtnetlink: Enslave device before bringing it up"")
>> Signed-off-by: Jeongjun Park <aha310510@...il.com>
>> ---
>>  drivers/net/team/team_core.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/team/team_core.c b/drivers/net/team/team_core.c
>> index ab1935a4aa2c..ee595c3c6624 100644
>> --- a/drivers/net/team/team_core.c
>> +++ b/drivers/net/team/team_core.c
>> @@ -1212,8 +1212,9 @@ static int team_port_add(struct team *team, struct net_device *port_dev,
>>                            portname);
>>                 goto err_port_enter;
>>         }
>> -
>> +       mutex_unlock(&team->lock);
>
>Why would this be safe ?
>
>All checks done in team_port_add() before this point would need to be
>redone after mutex_lock() ?
>
>If another mutex (rtnl ?) is already protecting this path, this would
>suggest team->lock should be removed,
>and RTNL should be used in all needed paths.

I agree. The problem is, not all other paths rely on rtnl, they rely on
team->lock instead. I believe that the best solution is to remove
team->lock entirely.

I looked into it, the only problem I see is team_nl_fill_one_option_get()
function. That operates without rtnl being held. Taking rtnl here is too
heavy, given that it may be repeatedly used to fetch hash stats.
It's read only, could be probably converted to RCU. It's in my todo list,
once I have some spare cycles.




>
>>         err = dev_open(port_dev, extack);
>> +       mutex_lock(&team->lock);
>
>
>
>>         if (err) {
>>                 netdev_dbg(dev, "Device %s opening failed\n",
>>                            portname);
>> --