linux-kernel - [Syzkaller & bisect] There is "io_ring_exit_work" related Call Trace in v5.2-rc5 kernel

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <Y9TgUupO5C39V/DW@xpf.sh.intel.com>
Date:   Sat, 28 Jan 2023 16:44:02 +0800
From:   Pengfei Xu <pengfei.xu@...el.com>
To:     <asml.silence@...il.com>, <linux-kernel@...r.kernel.org>
CC:     <heng.su@...el.com>, <pengfei.xu@...el.com>, <axboe@...nel.dk>
Subject: [Syzkaller & bisect] There is "io_ring_exit_work" related Call Trace
 in v5.2-rc5 kernel

Hi Pavel Begunkov and kernel expert,

Greeting!

There is "io_ring_exit_work" related Call Trace in v5.2-rc5 kernel in guest
on Sapphire Rapids server.

INFO: task kworker/u4:2:32 blocked for more than 147 seconds.
[  300.212512]       Not tainted 6.2.0-rc5-2241ab53cbb5 #1
[  300.213087] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  300.213921] task:kworker/u4:2    state:D stack:0     pid:32    ppid:2      flags:0x00004000
[  300.214849] Workqueue: events_unbound io_ring_exit_work
[  300.215506] Call Trace:
[  300.215789]  <TASK>
[  300.216051]  __schedule+0x385/0xb00
[  300.216467]  ? write_comp_data+0x2f/0x90
[  300.216949]  ? wait_for_completion+0x7b/0x180
[  300.217445]  schedule+0x5b/0xe0
[  300.217836]  schedule_timeout+0x561/0x650
[  300.218301]  ? _raw_spin_unlock_irq+0x2b/0x60
[  300.218816]  ? wait_for_completion+0x7b/0x180
[  300.219341]  ? __this_cpu_preempt_check+0x20/0x30
[  300.219893]  ? lockdep_hardirqs_on+0x8a/0x110
[  300.220384]  ? _raw_spin_unlock_irq+0x2b/0x60
[  300.220881]  ? wait_for_completion+0x9e/0x180
[  300.221382]  ? trace_hardirqs_on+0x3f/0x100
[  300.221868]  ? wait_for_completion+0x7b/0x180
[  300.222379]  wait_for_completion+0xa6/0x180
[  300.222897]  io_ring_exit_work+0x2f7/0x747
[  300.223383]  ? __pfx_io_tctx_exit_cb+0x10/0x10
[  300.223947]  process_one_work+0x3b1/0x960
[  300.224446]  worker_thread+0x52/0x660
[  300.224884]  ? __pfx_worker_thread+0x10/0x10
[  300.225376]  kthread+0x161/0x1a0
[  300.225782]  ? __pfx_kthread+0x10/0x10
[  300.226225]  ret_from_fork+0x29/0x50
[  300.226688]  </TASK>
[  300.227791] 
[  300.227791] Showing all locks held in the system:
[  300.228501] 1 lock held by rcu_tasks_kthre/11:
[  300.229003]  #0: ffffffff83d617b0 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x2d/0x3d0
[  300.230098] 1 lock held by rcu_tasks_rude_/12:
[  300.230601]  #0: ffffffff83d61530 (rcu_tasks_rude.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x2d/0x3d0
[  300.231733] 1 lock held by rcu_tasks_trace/13:
[  300.232244]  #0: ffffffff83d61270 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x2d/0x3d0
[  300.233377] 1 lock held by khungtaskd/29:
[  300.233842]  #0: ffffffff83d621c0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x27/0x18c
[  300.234866] 2 locks held by kworker/u4:2/32:
[  300.235351]  #0: ff11000006045d38 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x2ff/0x960
[  300.236502]  #1: ffa0000000117e60 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x303/0x960
[  300.237696] 
[  300.237882] =============================================
[  300.237882] 

Bisected and found the first bad commit is:
eebd2e37e662617a6b8041db75205f0a262ce870
io_uring: don't take task ring-file notes

But after reverted above commit eebd2e37e on top of v6.2-rc5 kernel and made
the kernel failed. So it's just the suspected commit for above problem.

Syzkaller reproduced code, bisect info, kconfig and v6.2-rc5 dmesg are in
attached.

All detailed info is in link:
https://github.com/xupengfe/syzkaller_logs/tree/main/230124_055801_io_ring_exit_work

I hope it's helpful.

Thanks!
BR.

View attachment "repro.c" of type "text/plain" (19895 bytes)

View attachment "bisect_info.log" of type "text/plain" (4102 bytes)

View attachment "kconfig" of type "text/plain" (278960 bytes)

View attachment "2241ab53cbb5cdb08a6b2d4688feb13971058f65_dmesg.log" of type "text/plain" (44208 bytes)