Message-ID: <CACT4Y+bUNeDM5SeZa0cQSuSJ0rq3i0Y063wK9cUvG9oxuQP7YQ@mail.gmail.com>
Date:   Sun, 29 Jan 2017 11:11:15 +0100
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Cong Wang <xiyou.wangcong@...il.com>
Cc:     syzkaller <syzkaller@...glegroups.com>,
        Eric Dumazet <edumazet@...gle.com>,
        David Miller <davem@...emloft.net>,
        Matti Vaittinen <matti.vaittinen@...ia.com>,
        Tycho Andersen <tycho.andersen@...onical.com>,
        Florian Westphal <fw@...len.de>,
        stephen hemminger <stephen@...workplumber.org>,
        Tom Herbert <tom@...bertland.com>,
        netdev <netdev@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Richard Guy Briggs <rgb@...hat.com>,
        netdev-owner@...r.kernel.org
Subject: Re: net: deadlock on genl_mutex

On Fri, Dec 9, 2016 at 6:08 AM, Cong Wang <xiyou.wangcong@...il.com> wrote:
>>> Chain exists of:
>>>  Possible unsafe locking scenario:
>>>
>>>        CPU0                    CPU1
>>>        ----                    ----
>>>   lock(genl_mutex);
>>>                                lock(nlk->cb_mutex);
>>>                                lock(genl_mutex);
>>>   lock(rtnl_mutex);
>>>
>>>  *** DEADLOCK ***
>>
>> This one looks legitimate, because nlk->cb_mutex could be rtnl_mutex.
>> Let me think about it.
>
> Never mind. Actually both reports in this thread are legitimate.
>
> I know what happened now, the lock chain is so long, 4 locks are involved
> to form a chain!!!
>
> Let me think about how to break the chain.


Cong, any success with breaking the chain?

Still happening on f0ad17712b9f71c24e2b8b9725230ef57232377f. Or is this
a different one?


[ INFO: possible circular locking dependency detected ]
4.10.0-rc3+ #4 Not tainted
-------------------------------------------------------
syz-executor9/2705 is trying to acquire lock:
 (genl_mutex){+.+.+.}, at: [<ffffffff836f58fe>] genl_lock net/netlink/genetlink.c:32 [inline]
 (genl_mutex){+.+.+.}, at: [<ffffffff836f58fe>] genl_family_rcv_msg+0xdae/0x1040 net/netlink/genetlink.c:547

but task is already holding lock:
 (rtnl_mutex){+.+.+.}, at: [<ffffffff836416e7>] rtnl_lock+0x17/0x20 net/core/rtnetlink.c:70

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (rtnl_mutex){+.+.+.}:

[<ffffffff8157e729>] validate_chain kernel/locking/lockdep.c:2265 [inline]
[<ffffffff8157e729>] __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3338
[<ffffffff815808b1>] lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3753
[<ffffffff843f9de0>] __mutex_lock_common kernel/locking/mutex.c:639 [inline]
[<ffffffff843f9de0>] mutex_lock_nested+0x290/0x1730 kernel/locking/mutex.c:753
[<ffffffff836416e7>] rtnl_lock+0x17/0x20 net/core/rtnetlink.c:70
[<ffffffff83fd5e9e>] nl80211_pre_doit+0x2fe/0x570 net/wireless/nl80211.c:11847
[<ffffffff836f52b0>] genl_family_rcv_msg+0x760/0x1040 net/netlink/genetlink.c:591
[<ffffffff836f807a>] genl_rcv_msg+0x19a/0x330 net/netlink/genetlink.c:620
[<ffffffff836f36cb>] netlink_rcv_skb+0x2ab/0x390 net/netlink/af_netlink.c:2298
[<ffffffff836f4b38>] genl_rcv+0x28/0x40 net/netlink/genetlink.c:631
[<ffffffff836f1f14>] netlink_unicast_kernel net/netlink/af_netlink.c:1231 [inline]
[<ffffffff836f1f14>] netlink_unicast+0x514/0x730 net/netlink/af_netlink.c:1257
[<ffffffff836f2bcf>] netlink_sendmsg+0xa9f/0xe50 net/netlink/af_netlink.c:1803
[<ffffffff83572d3a>] sock_sendmsg_nosec net/socket.c:635 [inline]
[<ffffffff83572d3a>] sock_sendmsg+0xca/0x110 net/socket.c:645
[<ffffffff8357557a>] ___sys_sendmsg+0x8fa/0x9f0 net/socket.c:1985
[<ffffffff83578138>] __sys_sendmsg+0x138/0x300 net/socket.c:2019
[<ffffffff8357832d>] SYSC_sendmsg net/socket.c:2030 [inline]
[<ffffffff8357832d>] SyS_sendmsg+0x2d/0x50 net/socket.c:2026
[<ffffffff8440e7c1>] entry_SYSCALL_64_fastpath+0x1f/0xc2

-> #0 (genl_mutex){+.+.+.}:

[<ffffffff8157847f>] check_prev_add kernel/locking/lockdep.c:1828 [inline]
[<ffffffff8157847f>] check_prevs_add+0xa8f/0x19f0 kernel/locking/lockdep.c:1938
[<ffffffff8157e729>] validate_chain kernel/locking/lockdep.c:2265 [inline]
[<ffffffff8157e729>] __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3338
[<ffffffff815808b1>] lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3753
[<ffffffff843f9de0>] __mutex_lock_common kernel/locking/mutex.c:639 [inline]
[<ffffffff843f9de0>] mutex_lock_nested+0x290/0x1730 kernel/locking/mutex.c:753
[<ffffffff836f58fe>] genl_lock net/netlink/genetlink.c:32 [inline]
[<ffffffff836f58fe>] genl_family_rcv_msg+0xdae/0x1040 net/netlink/genetlink.c:547
[<ffffffff836f807a>] genl_rcv_msg+0x19a/0x330 net/netlink/genetlink.c:620
[<ffffffff836f36cb>] netlink_rcv_skb+0x2ab/0x390 net/netlink/af_netlink.c:2298
[<ffffffff836f4b38>] genl_rcv+0x28/0x40 net/netlink/genetlink.c:631
[<ffffffff836f1f14>] netlink_unicast_kernel net/netlink/af_netlink.c:1231 [inline]
[<ffffffff836f1f14>] netlink_unicast+0x514/0x730 net/netlink/af_netlink.c:1257
[<ffffffff836f2bcf>] netlink_sendmsg+0xa9f/0xe50 net/netlink/af_netlink.c:1803
[<ffffffff83572d3a>] sock_sendmsg_nosec net/socket.c:635 [inline]
[<ffffffff83572d3a>] sock_sendmsg+0xca/0x110 net/socket.c:645
[<ffffffff835730a6>] sock_write_iter+0x326/0x600 net/socket.c:848
[<ffffffff81a3c493>] new_sync_write fs/read_write.c:499 [inline]
[<ffffffff81a3c493>] __vfs_write+0x483/0x740 fs/read_write.c:512
[<ffffffff81a42227>] vfs_write+0x187/0x530 fs/read_write.c:560
[<ffffffff81a4675b>] SYSC_write fs/read_write.c:607 [inline]
[<ffffffff81a4675b>] SyS_write+0xfb/0x230 fs/read_write.c:599
[<ffffffff8440e7c1>] entry_SYSCALL_64_fastpath+0x1f/0xc2

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(rtnl_mutex);
                               lock(genl_mutex);
                               lock(rtnl_mutex);
  lock(genl_mutex);

 *** DEADLOCK ***

2 locks held by syz-executor9/2705:
 #0:  (cb_lock){++++++}, at: [<ffffffff836f4b29>] genl_rcv+0x19/0x40 net/netlink/genetlink.c:630
 #1:  (rtnl_mutex){+.+.+.}, at: [<ffffffff836416e7>] rtnl_lock+0x17/0x20 net/core/rtnetlink.c:70

stack backtrace:
CPU: 1 PID: 2705 Comm: syz-executor9 Not tainted 4.10.0-rc3+ #4
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
 print_circular_bug+0x307/0x3b0 kernel/locking/lockdep.c:1202
 check_prev_add kernel/locking/lockdep.c:1828 [inline]
 check_prevs_add+0xa8f/0x19f0 kernel/locking/lockdep.c:1938
 validate_chain kernel/locking/lockdep.c:2265 [inline]
 __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3338
 lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3753
 __mutex_lock_common kernel/locking/mutex.c:639 [inline]
 mutex_lock_nested+0x290/0x1730 kernel/locking/mutex.c:753
 genl_lock net/netlink/genetlink.c:32 [inline]
 genl_family_rcv_msg+0xdae/0x1040 net/netlink/genetlink.c:547
 genl_rcv_msg+0x19a/0x330 net/netlink/genetlink.c:620
 netlink_rcv_skb+0x2ab/0x390 net/netlink/af_netlink.c:2298
 genl_rcv+0x28/0x40 net/netlink/genetlink.c:631
 netlink_unicast_kernel net/netlink/af_netlink.c:1231 [inline]
 netlink_unicast+0x514/0x730 net/netlink/af_netlink.c:1257
 netlink_sendmsg+0xa9f/0xe50 net/netlink/af_netlink.c:1803
 sock_sendmsg_nosec net/socket.c:635 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:645
 sock_write_iter+0x326/0x600 net/socket.c:848
 new_sync_write fs/read_write.c:499 [inline]
 __vfs_write+0x483/0x740 fs/read_write.c:512
 vfs_write+0x187/0x530 fs/read_write.c:560
 SYSC_write fs/read_write.c:607 [inline]
 SyS_write+0xfb/0x230 fs/read_write.c:599
 entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x44f5e9
RSP: 002b:00007fdba138cb58 EFLAGS: 00000212 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000020000fdc RCX: 000000000044f5e9
RDX: 0000000000000024 RSI: 0000000020000fdc RDI: 0000000000000006
RBP: 0000000000000006 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000212 R12: 0000000000700000
R13: 0000000000000002 R14: 0000000000000010 R15: 0000000000000000
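
For readers less familiar with this kind of report: the inversion above (genl_mutex taken before rtnl_mutex on one path, rtnl_mutex held while taking genl_mutex on the other) can be modelled with a toy lock-order graph. This is only an illustrative user-space sketch in Python, not how lockdep is implemented; the class `LockOrderGraph` and its `acquire` method are hypothetical names made up for this example, and only the lock names come from the report.

```python
class LockOrderGraph:
    """Toy model of lock-order tracking (hypothetical, not lockdep itself)."""

    def __init__(self):
        # edges[a] holds every lock that was acquired while a was held
        self.edges = {}

    def _reachable(self, src, dst):
        # Iterative depth-first search over the recorded "held -> acquired" edges
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges.get(node, ()))
        return False

    def acquire(self, held, new):
        """Record held -> new edges; return True if this closes a cycle."""
        # A cycle exists if some held lock is already ordered after `new`
        inversion = any(self._reachable(new, h) for h in held)
        for h in held:
            self.edges.setdefault(h, set()).add(new)
        return inversion


graph = LockOrderGraph()
# Earlier recorded dependency: rtnl_mutex taken while genl_mutex is held
print(graph.acquire(["genl_mutex"], "rtnl_mutex"))             # False
# Reported path: genl_mutex taken while cb_lock and rtnl_mutex are held
print(graph.acquire(["cb_lock", "rtnl_mutex"], "genl_mutex"))  # True
```

The second call returns True because the graph already contains genl_mutex -> rtnl_mutex, so taking genl_mutex with rtnl_mutex held closes the cycle, which is the situation the "Possible unsafe locking scenario" table describes.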
