Date:   Wed, 6 May 2020 19:20:50 +0800
From:   Yunsheng Lin <linyunsheng@...wei.com>
To:     Hillf Danton <hdanton@...a.com>
CC:     syzbot <syzbot+96ff6cfc4551fcc29342@...kaller.appspotmail.com>,
        <allison@...utok.net>, <aviad.krawczyk@...wei.com>,
        <axboe@...nel.dk>, <davem@...emloft.net>,
        <gregkh@...uxfoundation.org>, <io-uring@...r.kernel.org>,
        <kuba@...nel.org>, <linux-fsdevel@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <luobin9@...wei.com>,
        <netdev@...r.kernel.org>, <syzkaller-bugs@...glegroups.com>,
        <tglx@...utronix.de>, <viro@...iv.linux.org.uk>
Subject: Re: INFO: task hung in linkwatch_event (2)

On 2020/5/6 12:25, Hillf Danton wrote:
> 
> On Wed, 6 May 2020 09:38:21 Yunsheng Lin wrote:
>>
>> On 2020/4/29 17:59, syzbot wrote:
>>> Hello,
>>>
>>> syzbot found the following crash on:
>>>
>>> HEAD commit:    b4f63322 Merge branch 'for-linus' of git://git.kernel.org/..
>>> git tree:       upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1558936fe00000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=b7a70e992f2f9b68
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=96ff6cfc4551fcc29342
>>> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=14a57828100000
>>>
>>> The bug was bisected to:
>>>
>>> commit 386d4716fd91869e07c731657f2cde5a33086516
>>> Author: Luo bin <luobin9@...wei.com>
>>> Date:   Thu Feb 27 06:34:44 2020 +0000
>>>
>>>     hinic: fix a bug of rss configuration
>>
>> The above patch does not seem to be the cause of the crash.
>>
>> From the below call trace, it seems the blocking is caused by
>> tun_detach(), which needs to flush all the pending work
> 
> queued on system_highpri_wq
> 
>> for each online cpu; it is the linkwatch_work that needs to be
>> flushed in this crash case.
> 
> Not so sure it's linkwatch_work because it's on system_wq.

Yes, you are right. The flush_backlog() work is queued on
system_highpri_wq, and that is probably the work tun_detach() is
trying to flush.

So tun_detach() is flushing a work queued on system_highpri_wq while
holding the RTNL lock, and the linkwatch_event work is running and
trying to take the RTNL lock. Do they compete for the same worker on
the same cpu even though they are queued on different workqueues? I do
not understand wq very well; if yes, there may be a deadlock here?
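In case a concrete picture helps, below is a rough userspace analogue
of the scenario I am asking about. It is only an illustration, not
kernel code, and all the names (worker_fn, rtnl_like_lock,
linkwatch_like_work, backlog_flush_like_work) are made up. It assumes
a single worker thread picks up both work items in order; if the two
works really can end up competing for the same worker, it stalls the
same way as the hung tasks below (it intentionally never terminates;
build with gcc -pthread):

/*
 * Illustration only, not kernel code: "task A" holds the rtnl-like
 * lock while waiting for the backlog-flush-like item, but the
 * linkwatch-like item is ahead of it on the single worker and needs
 * that same lock, so nothing makes progress.
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t rtnl_like_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t flush_done_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  flush_done_cv  = PTHREAD_COND_INITIALIZER;
static int flush_done;

static void linkwatch_like_work(void)
{
        pthread_mutex_lock(&rtnl_like_lock);    /* blocks: task A holds it */
        pthread_mutex_unlock(&rtnl_like_lock);
}

static void backlog_flush_like_work(void)
{
        pthread_mutex_lock(&flush_done_mtx);
        flush_done = 1;                         /* "work completed"        */
        pthread_cond_signal(&flush_done_cv);
        pthread_mutex_unlock(&flush_done_mtx);
}

/* one busy worker picking up queued items strictly in order */
static void *worker_fn(void *arg)
{
        (void)arg;
        linkwatch_like_work();                  /* never returns ...       */
        backlog_flush_like_work();              /* ... so this never runs  */
        return NULL;
}

int main(void)
{
        pthread_t worker;

        pthread_mutex_lock(&rtnl_like_lock);    /* task A: "rtnl_lock()"   */
        pthread_create(&worker, NULL, worker_fn, NULL);

        /* task A: "flush_work()" -- wait for the flush item to finish */
        pthread_mutex_lock(&flush_done_mtx);
        while (!flush_done)
                pthread_cond_wait(&flush_done_cv, &flush_done_mtx);
        pthread_mutex_unlock(&flush_done_mtx);

        pthread_mutex_unlock(&rtnl_like_lock);  /* never reached           */
        pthread_join(worker, NULL);
        puts("no deadlock");
        return 0;
}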

> 
>> But the linkwatch_work() needs to take the
>> RTNL lock, which is already held by tun_detach(), and that is
>> where the blocking is happening.
>>
>> Possible ways to fix or avoid this:
>> 1. Call flush_all_backlogs() without holding the RTNL lock; I am not
>>    sure it is safe to do this.
>> 2. Disable queuing link events for the unregistering netdev, and flush
>>    all the pending link events without taking the RTNL lock before
>>    calling unregister_netdevice() in tun_detach().
>>
>> Any better suggestion? Thanks.
>>
> Not before some extra info about what's going on with the highpri wq is available.
> 
>>>
>>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=16626fcfe00000
>>> final crash:    https://syzkaller.appspot.com/x/report.txt?x=15626fcfe00000
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=11626fcfe00000
>>>
>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>> Reported-by: syzbot+96ff6cfc4551fcc29342@...kaller.appspotmail.com
>>> Fixes: 386d4716fd91 ("hinic: fix a bug of rss configuration")
>>>
>>> INFO: task kworker/1:5:2724 blocked for more than 143 seconds.
>>>       Not tainted 5.7.0-rc2-syzkaller #0
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> kworker/1:5     D27416  2724      2 0x80004000
>>> Workqueue: events linkwatch_event
>>> Call Trace:
>>>  schedule+0xd0/0x2a0 kernel/sched/core.c:4163
>>>  schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:4222
>>>  __mutex_lock_common kernel/locking/mutex.c:1033 [inline]
>>>  __mutex_lock+0x7ab/0x13c0 kernel/locking/mutex.c:1103
>>>  linkwatch_event+0xb/0x60 net/core/link_watch.c:242
>>>  process_one_work+0x965/0x16a0 kernel/workqueue.c:2268
>>>  worker_thread+0x96/0xe20 kernel/workqueue.c:2414
>>>  kthread+0x388/0x470 kernel/kthread.c:268
>>>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
>>> INFO: task syz-executor.0:7053 blocked for more than 143 seconds.
>>>       Not tainted 5.7.0-rc2-syzkaller #0
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> syz-executor.0  D23512  7053      1 0x80004006
>>> Call Trace:
>>>  schedule+0xd0/0x2a0 kernel/sched/core.c:4163
>>>  schedule_timeout+0x55b/0x850 kernel/time/timer.c:1874
>>>  do_wait_for_common kernel/sched/completion.c:85 [inline]
>>>  __wait_for_common kernel/sched/completion.c:106 [inline]
>>>  wait_for_common kernel/sched/completion.c:117 [inline]
>>>  wait_for_completion+0x16a/0x270 kernel/sched/completion.c:138
>>>  __flush_work+0x4fd/0xa80 kernel/workqueue.c:3045
>>>  flush_all_backlogs net/core/dev.c:5527 [inline]
>>>  rollback_registered_many+0x562/0xe70 net/core/dev.c:8813
>>>  rollback_registered+0xf2/0x1c0 net/core/dev.c:8873
>>>  unregister_netdevice_queue net/core/dev.c:9969 [inline]
>>>  unregister_netdevice_queue+0x1d7/0x2b0 net/core/dev.c:9962
>>>  unregister_netdevice include/linux/netdevice.h:2725 [inline]
>>>  __tun_detach+0xe42/0x1110 drivers/net/tun.c:690
>>>  tun_detach drivers/net/tun.c:707 [inline]
>>>  tun_chr_close+0xd9/0x180 drivers/net/tun.c:3413
>>>  __fput+0x33e/0x880 fs/file_table.c:280
>>>  task_work_run+0xf4/0x1b0 kernel/task_work.c:123
>>>  exit_task_work include/linux/task_work.h:22 [inline]
>>>  do_exit+0xb34/0x2dd0 kernel/exit.c:795
>>>  do_group_exit+0x125/0x340 kernel/exit.c:893
>>>  get_signal+0x47b/0x24e0 kernel/signal.c:2739
>>>  do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784
>>>  exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161
>>>  prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
>>>  syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
>>>  do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
>>>  entry_SYSCALL_64_after_hwframe+0x49/0xb3
>>> RIP: 0033:0x4166ca
>>> Code: Bad RIP value.
>>> RSP: 002b:00007ffd4022d478 EFLAGS: 00000246 ORIG_RAX: 000000000000003d
>>> RAX: fffffffffffffe00 RBX: 0000000001d60940 RCX: 00000000004166ca
>>> RDX: 0000000040000000 RSI: 00007ffd4022d4b0 RDI: ffffffffffffffff
>>> RBP: 0000000000002996 R08: 0000000000000001 R09: 0000000000000001
>>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
>>> R13: 00007ffd4022d4b0 R14: 0000000001d6099b R15: 00007ffd4022d4c0
>>>
>>> Showing all locks held in the system:
>>> 1 lock held by khungtaskd/1125:
>>>  #0: ffffffff899beb00 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:5754
>>> 3 locks held by kworker/1:5/2724:
>>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: __write_once_size include/linux/compiler.h:226 [inline]
>>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
>>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:615 [inline]
>>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
>>>  #0: ffff8880aa026d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x844/0x16a0 kernel/workqueue.c:2239
>>>  #1: ffffc90008367dc0 ((linkwatch_work).work){+.+.}-{0:0}, at: process_one_work+0x878/0x16a0 kernel/workqueue.c:2243
>>>  #2: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: linkwatch_event+0xb/0x60 net/core/link_watch.c:242
>>> 1 lock held by in:imklog/6717:
>>>  #0: ffff888098d271b0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:826
>>> 2 locks held by syz-executor.0/7053:
>>>  #0: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: tun_detach drivers/net/tun.c:704 [inline]
>>>  #0: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: tun_chr_close+0x3a/0x180 drivers/net/tun.c:3413
>>>  #1: ffffffff89979ad0 (cpu_hotplug_lock){++++}-{0:0}, at: get_online_cpus include/linux/cpu.h:143 [inline]
>>>  #1: ffffffff89979ad0 (cpu_hotplug_lock){++++}-{0:0}, at: flush_all_backlogs net/core/dev.c:5520 [inline]
>>>  #1: ffffffff89979ad0 (cpu_hotplug_lock){++++}-{0:0}, at: rollback_registered_many+0x45b/0xe70 net/core/dev.c:8813
>>> 3 locks held by kworker/1:6/14336:
>>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: __write_once_size include/linux/compiler.h:226 [inline]
>>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:855 [inline]
>>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:615 [inline]
>>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
>>>  #0: ffff88809ace8d38 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: process_one_work+0x844/0x16a0 kernel/workqueue.c:2239
>>>  #1: ffffc90004637dc0 ((addr_chk_work).work){+.+.}-{0:0}, at: process_one_work+0x878/0x16a0 kernel/workqueue.c:2243
>>>  #2: ffffffff8a582268 (rtnl_mutex){+.+.}-{3:3}, at: addrconf_verify_work+0xa/0x20 net/ipv6/addrconf.c:4584
>>>
>>> =============================================
>>>
>>> NMI backtrace for cpu 1
>>> CPU: 1 PID: 1125 Comm: khungtaskd Not tainted 5.7.0-rc2-syzkaller #0
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:77 [inline]
>>>  dump_stack+0x188/0x20d lib/dump_stack.c:118
>>>  nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101
>>>  nmi_trigger_cpumask_backtrace+0x231/0x27e lib/nmi_backtrace.c:62
>>>  trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
>>>  check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
>>>  watchdog+0xa8c/0x1010 kernel/hung_task.c:289
>>>  kthread+0x388/0x470 kernel/kthread.c:268
>>>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
>>> Sending NMI from CPU 1 to CPUs 0:
>>> NMI backtrace for cpu 0
>>> CPU: 0 PID: 28894 Comm: syz-executor.0 Not tainted 5.7.0-rc2-syzkaller #0
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>> RIP: 0010:io_ring_ctx_wait_and_kill+0x98/0x5e0 fs/io_uring.c:7329
>>> Code: 01 00 00 4d 89 f4 48 b8 00 00 00 00 00 fc ff df 4c 89 ed 49 c1 ec 03 48 c1 ed 03 49 01 c4 48 01 c5 eb 1c e8 6a f2 9d ff f3 90 <41> 80 3c 24 00 0f 85 b0 04 00 00 48 83 bb 10 01 00 00 00 74 21 e8
>>> RSP: 0018:ffffc90004e17a48 EFLAGS: 00000293
>>> RAX: ffff888091758480 RBX: ffff888094860000 RCX: 1ffff920009c2f36
>>> RDX: 0000000000000000 RSI: ffffffff81d53c26 RDI: ffff888094860300
>>> RBP: ffffed101290c02c R08: 0000000000000001 R09: ffffed101290c061
>>> R10: ffff888094860307 R11: ffffed101290c060 R12: ffffed101290c022
>>> R13: ffff888094860160 R14: ffff888094860110 R15: ffffffff81d54170
>>> FS:  00007fac6c1a8700(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 0000560ad6a654a7 CR3: 0000000009879000 CR4: 00000000001406f0
>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>> Call Trace:
>>>  io_uring_release+0x3e/0x50 fs/io_uring.c:7352
>>>  __fput+0x33e/0x880 fs/file_table.c:280
>>>  task_work_run+0xf4/0x1b0 kernel/task_work.c:123
>>>  exit_task_work include/linux/task_work.h:22 [inline]
>>>  do_exit+0xb34/0x2dd0 kernel/exit.c:795
>>>  do_group_exit+0x125/0x340 kernel/exit.c:893
>>>  get_signal+0x47b/0x24e0 kernel/signal.c:2739
>>>  do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784
>>>  exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161
>>>  prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
>>>  syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
>>>  do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
>>>  entry_SYSCALL_64_after_hwframe+0x49/0xb3
>>> RIP: 0033:0x45c829
>>> Code: Bad RIP value.
>>> RSP: 002b:00007fac6c1a7c78 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
>>> RAX: 0000000000000003 RBX: 00000000004e0bc0 RCX: 000000000045c829
>>> RDX: 0000000000000000 RSI: 0000000020000580 RDI: 00000000000000f1
>>> RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
>>> R13: 0000000000000204 R14: 00000000004c425f R15: 00007fac6c1a86d4
>>>
>>>
>>> ---
>>> This bug is generated by a bot. It may contain errors.
>>> See https://goo.gl/tpsmEJ for more information about syzbot.
>>> syzbot engineers can be reached at syzkaller@...glegroups.com.
>>>
>>> syzbot will keep track of this bug report. See:
>>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>>> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
>>> syzbot can test patches for this bug, for details see:
>>> https://goo.gl/tpsmEJ#testing-patches
> 
> .
> 
