Message-ID: <39ebb181-6760-cdfd-88f8-5578ad4d7c85@gmail.com>
Date:   Mon, 1 Feb 2021 11:04:04 +0000
From:   Pavel Begunkov <asml.silence@...il.com>
To:     syzbot <syzbot+2f5d1785dc624932da78@...kaller.appspotmail.com>,
        axboe@...nel.dk, hdanton@...a.com, io-uring@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        syzkaller-bugs@...glegroups.com, viro@...iv.linux.org.uk
Subject: Re: WARNING in io_disable_sqo_submit

On 18/01/2021 12:46, syzbot wrote:
> Hello,
> 
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> INFO: task hung in io_sq_thread_stop

#syz test: git://git.kernel.dk/linux-block for-5.12/io_uring

> INFO: task kworker/u4:0:8 blocked for more than 143 seconds.
>       Not tainted 5.11.0-rc1-syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:kworker/u4:0    state:D stack:24056 pid:    8 ppid:     2 flags:0x00004000
> Workqueue: events_unbound io_ring_exit_work
> Call Trace:
>  context_switch kernel/sched/core.c:4313 [inline]
>  __schedule+0x90c/0x21a0 kernel/sched/core.c:5064
>  schedule+0xcf/0x270 kernel/sched/core.c:5143
>  schedule_timeout+0x1d8/0x250 kernel/time/timer.c:1854
>  do_wait_for_common kernel/sched/completion.c:85 [inline]
>  __wait_for_common kernel/sched/completion.c:106 [inline]
>  wait_for_common kernel/sched/completion.c:117 [inline]
>  wait_for_completion+0x163/0x260 kernel/sched/completion.c:138
>  kthread_park+0x122/0x1b0 kernel/kthread.c:557
>  io_sq_thread_park fs/io_uring.c:7445 [inline]
>  io_sq_thread_park fs/io_uring.c:7439 [inline]
>  io_sq_thread_stop+0xfe/0x570 fs/io_uring.c:7463
>  io_finish_async fs/io_uring.c:7481 [inline]
>  io_ring_ctx_free fs/io_uring.c:8646 [inline]
>  io_ring_exit_work+0x62/0x6d0 fs/io_uring.c:8739
>  process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
>  worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
>  kthread+0x3b1/0x4a0 kernel/kthread.c:292
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
> 
> Showing all locks held in the system:
> 3 locks held by kworker/u4:0/8:
>  #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>  #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
>  #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
>  #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
>  #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
>  #0: ffff888010069138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x871/0x15f0 kernel/workqueue.c:2246
>  #1: ffffc90000cd7da8 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x8a5/0x15f0 kernel/workqueue.c:2250
>  #2: ffff88801bfd4870 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread_park fs/io_uring.c:7444 [inline]
>  #2: ffff88801bfd4870 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread_park fs/io_uring.c:7439 [inline]
>  #2: ffff88801bfd4870 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread_stop+0xd6/0x570 fs/io_uring.c:7463
> 1 lock held by khungtaskd/1647:
>  #0: ffffffff8b373aa0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6254
> 1 lock held by in:imklog/8164:
>  #0: ffff8880151b8870 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:947
> 2 locks held by kworker/u4:6/8415:
> 2 locks held by kworker/0:4/8690:
>  #0: ffff88801007c538 ((wq_completion)rcu_gp){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>  #0: ffff88801007c538 ((wq_completion)rcu_gp){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
>  #0: ffff88801007c538 ((wq_completion)rcu_gp){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
>  #0: ffff88801007c538 ((wq_completion)rcu_gp){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
>  #0: ffff88801007c538 ((wq_completion)rcu_gp){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
>  #0: ffff88801007c538 ((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x871/0x15f0 kernel/workqueue.c:2246
>  #1: ffffc9000288fda8 ((work_completion)(&rew.rew_work)){+.+.}-{0:0}, at: process_one_work+0x8a5/0x15f0 kernel/workqueue.c:2250
> 1 lock held by syz-executor.3/8865:
>  #0: ffff888146ddcd88 (&xt[i].mutex){+.+.}-{3:3}, at: xt_find_table_lock+0x41/0x540 net/netfilter/x_tables.c:1206
> 1 lock held by syz-executor.2/8867:
>  #0: ffff888146ddcd88 (&xt[i].mutex){+.+.}-{3:3}, at: xt_find_table_lock+0x41/0x540 net/netfilter/x_tables.c:1206
> 2 locks held by syz-executor.5/8869:
>  #0: ffff888146ddcd88 (&xt[i].mutex){+.+.}-{3:3}, at: xt_find_table_lock+0x41/0x540 net/netfilter/x_tables.c:1206
>  #1: ffffffff8b37c368 (rcu_state.exp_mutex){+.+.}-{3:3}, at: exp_funnel_lock kernel/rcu/tree_exp.h:290 [inline]
>  #1: ffffffff8b37c368 (rcu_state.exp_mutex){+.+.}-{3:3}, at: synchronize_rcu_expedited+0x4f2/0x610 kernel/rcu/tree_exp.h:836
> 1 lock held by syz-executor.4/8870:
>  #0: ffff888146ddcd88 (&xt[i].mutex){+.+.}-{3:3}, at: xt_find_table_lock+0x41/0x540 net/netfilter/x_tables.c:1206
> 1 lock held by syz-executor.0/8872:
>  #0: ffff888146ddcd88 (&xt[i].mutex){+.+.}-{3:3}, at: xt_find_table_lock+0x41/0x540 net/netfilter/x_tables.c:1206
> 1 lock held by syz-executor.1/8873:
>  #0: ffff888146ddcd88 (&xt[i].mutex){+.+.}-{3:3}, at: xt_find_table_lock+0x41/0x540 net/netfilter/x_tables.c:1206
> 
> =============================================
> 
> NMI backtrace for cpu 1
> CPU: 1 PID: 1647 Comm: khungtaskd Not tainted 5.11.0-rc1-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:79 [inline]
>  dump_stack+0x107/0x163 lib/dump_stack.c:120
>  nmi_cpu_backtrace.cold+0x44/0xd7 lib/nmi_backtrace.c:105
>  nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62
>  trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
>  check_hung_uninterruptible_tasks kernel/hung_task.c:209 [inline]
>  watchdog+0xd43/0xfa0 kernel/hung_task.c:294
>  kthread+0x3b1/0x4a0 kernel/kthread.c:292
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
> Sending NMI from CPU 1 to CPUs 0:
> NMI backtrace for cpu 0
> CPU: 0 PID: 8415 Comm: kworker/u4:6 Not tainted 5.11.0-rc1-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Workqueue: bat_events batadv_nc_worker
> RIP: 0010:__this_cpu_preempt_check+0xd/0x20 lib/smp_processor_id.c:70
> Code: 00 00 48 c7 c6 00 d9 9e 89 48 c7 c7 40 d9 9e 89 e9 98 fe ff ff 0f 1f 84 00 00 00 00 00 55 48 89 fd 0f 1f 44 00 00 48 89 ee 5d <48> c7 c7 80 d9 9e 89 e9 77 fe ff ff cc cc cc cc cc cc cc 0f 1f 44
> RSP: 0018:ffffc9000c507af0 EFLAGS: 00000046
> RAX: 0000000000000001 RBX: 0000000000000000 RCX: 1ffffffff1a077ab
> RDX: 0000000000000000 RSI: ffffffff894bac40 RDI: ffffffff894bac40
> RBP: ffffffff8b3739e0 R08: 0000000000000000 R09: ffffffff8d038b8f
> R10: fffffbfff1a07171 R11: 0000000000000000 R12: 0000000000000001
> R13: ffff88802f858bc0 R14: 00000000ffffffff R15: ffffffff889a5430
> FS:  0000000000000000(0000) GS:ffff8880b9e00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fbcc03ca000 CR3: 0000000011523000 CR4: 0000000000350ef0
> Call Trace:
>  lockdep_recursion_inc kernel/locking/lockdep.c:432 [inline]
>  lock_is_held_type+0x34/0x100 kernel/locking/lockdep.c:5475
>  lock_is_held include/linux/lockdep.h:271 [inline]
>  rcu_read_lock_sched_held+0x3a/0x70 kernel/rcu/update.c:123
>  trace_lock_release include/trace/events/lock.h:58 [inline]
>  lock_release+0x5b7/0x710 kernel/locking/lockdep.c:5448
>  __raw_spin_unlock_bh include/linux/spinlock_api_smp.h:174 [inline]
>  _raw_spin_unlock_bh+0x12/0x30 kernel/locking/spinlock.c:207
>  spin_unlock_bh include/linux/spinlock.h:399 [inline]
>  batadv_nc_purge_paths+0x2a5/0x3a0 net/batman-adv/network-coding.c:467
>  batadv_nc_worker+0x831/0xe50 net/batman-adv/network-coding.c:716
>  process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
>  worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
>  kthread+0x3b1/0x4a0 kernel/kthread.c:292
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
> 
> 
> Tested on:
> 
> commit:         a1235e44 io_uring: cancel all requests on task exit
> git tree:       git://git.kernel.dk/linux-block io_uring-5.11
> console output: https://syzkaller.appspot.com/x/log.txt?x=10c53584d00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c6b6b5cccb0f38f2
> dashboard link: https://syzkaller.appspot.com/bug?extid=2f5d1785dc624932da78
> compiler:       gcc (GCC) 10.1.0-syz 20200507
> 

-- 
Pavel Begunkov
