Message-ID: <0bc36797-fe4e-46ba-933d-0b3d508ed0dd@kernel.dk>
Date: Sun, 18 Jan 2026 11:34:15 -0700
From: Jens Axboe <axboe@...nel.dk>
To: Caleb Sander Mateos <csander@...estorage.com>,
 syzbot ci <syzbot+ci6d21afd0455de45a@...kaller.appspotmail.com>
Cc: io-uring@...r.kernel.org, joannelkoong@...il.com,
 linux-kernel@...r.kernel.org, oliver.sang@...el.com,
 syzbot@...kaller.appspotmail.com, syzbot@...ts.linux.dev,
 syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot ci] Re: io_uring: avoid uring_lock for
 IORING_SETUP_SINGLE_ISSUER

On 12/22/25 1:19 PM, Caleb Sander Mateos wrote:
> On Thu, Dec 18, 2025 at 3:01 AM syzbot ci
> <syzbot+ci6d21afd0455de45a@...kaller.appspotmail.com> wrote:
>>
>> syzbot ci has tested the following series
>>
>> [v6] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
>> https://lore.kernel.org/all/20251218024459.1083572-1-csander@purestorage.com
>> * [PATCH v6 1/6] io_uring: use release-acquire ordering for IORING_SETUP_R_DISABLED
>> * [PATCH v6 2/6] io_uring: clear IORING_SETUP_SINGLE_ISSUER for IORING_SETUP_SQPOLL
>> * [PATCH v6 3/6] io_uring: ensure submitter_task is valid for io_ring_ctx's lifetime
>> * [PATCH v6 4/6] io_uring: use io_ring_submit_lock() in io_iopoll_req_issued()
>> * [PATCH v6 5/6] io_uring: factor out uring_lock helpers
>> * [PATCH v6 6/6] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
>>
>> and found the following issue:
>> INFO: task hung in io_wq_put_and_exit
>>
>> Full report is available here:
>> https://ci.syzbot.org/series/21eac721-670b-4f34-9696-66f9b28233ac
>>
>> ***
>>
>> INFO: task hung in io_wq_put_and_exit
>>
>> tree:      torvalds
>> URL:       https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux
>> base:      d358e5254674b70f34c847715ca509e46eb81e6f
>> arch:      amd64
>> compiler:  Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
>> config:    https://ci.syzbot.org/builds/1710cffe-7d78-4489-9aa1-823b8c2532ed/config
>> syz repro: https://ci.syzbot.org/findings/74ae8703-9484-4d82-aa78-84cc37dcb1ef/syz_repro
>>
>> INFO: task syz.1.18:6046 blocked for more than 143 seconds.
>>       Not tainted syzkaller #0
>>       Blocked by coredump.
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:syz.1.18        state:D stack:25672 pid:6046  tgid:6045  ppid:5971   task_flags:0x400548 flags:0x00080004
>> Call Trace:
>>  <TASK>
>>  context_switch kernel/sched/core.c:5256 [inline]
>>  __schedule+0x14bc/0x5000 kernel/sched/core.c:6863
>>  __schedule_loop kernel/sched/core.c:6945 [inline]
>>  schedule+0x165/0x360 kernel/sched/core.c:6960
>>  schedule_timeout+0x9a/0x270 kernel/time/sleep_timeout.c:75
>>  do_wait_for_common kernel/sched/completion.c:100 [inline]
>>  __wait_for_common kernel/sched/completion.c:121 [inline]
>>  wait_for_common kernel/sched/completion.c:132 [inline]
>>  wait_for_completion+0x2bf/0x5d0 kernel/sched/completion.c:153
>>  io_wq_exit_workers io_uring/io-wq.c:1328 [inline]
>>  io_wq_put_and_exit+0x316/0x650 io_uring/io-wq.c:1356
>>  io_uring_clean_tctx+0x11f/0x1a0 io_uring/tctx.c:207
>>  io_uring_cancel_generic+0x6ca/0x7d0 io_uring/cancel.c:652
>>  io_uring_files_cancel include/linux/io_uring.h:19 [inline]
>>  do_exit+0x345/0x2310 kernel/exit.c:911
>>  do_group_exit+0x21c/0x2d0 kernel/exit.c:1112
>>  get_signal+0x1285/0x1340 kernel/signal.c:3034
>>  arch_do_signal_or_restart+0x9a/0x7a0 arch/x86/kernel/signal.c:337
>>  __exit_to_user_mode_loop kernel/entry/common.c:41 [inline]
>>  exit_to_user_mode_loop+0x87/0x4f0 kernel/entry/common.c:75
>>  __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
>>  syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
>>  syscall_exit_to_user_mode_work include/linux/entry-common.h:159 [inline]
>>  syscall_exit_to_user_mode include/linux/entry-common.h:194 [inline]
>>  do_syscall_64+0x2e3/0xf80 arch/x86/entry/syscall_64.c:100
>>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
>> RIP: 0033:0x7f6a8b58f7c9
>> RSP: 002b:00007f6a8c4a00e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
>> RAX: 0000000000000001 RBX: 00007f6a8b7e5fa8 RCX: 00007f6a8b58f7c9
>> RDX: 00000000000f4240 RSI: 0000000000000081 RDI: 00007f6a8b7e5fac
>> RBP: 00007f6a8b7e5fa0 R08: 3fffffffffffffff R09: 0000000000000000
>> R10: 0000000000000800 R11: 0000000000000246 R12: 0000000000000000
>> R13: 00007f6a8b7e6038 R14: 00007ffcac96d220 R15: 00007ffcac96d308
>>  </TASK>
>> INFO: task iou-wrk-6046:6047 blocked for more than 143 seconds.
>>       Not tainted syzkaller #0
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:iou-wrk-6046    state:D stack:27760 pid:6047  tgid:6045  ppid:5971   task_flags:0x404050 flags:0x00080002
>> Call Trace:
>>  <TASK>
>>  context_switch kernel/sched/core.c:5256 [inline]
>>  __schedule+0x14bc/0x5000 kernel/sched/core.c:6863
>>  __schedule_loop kernel/sched/core.c:6945 [inline]
>>  schedule+0x165/0x360 kernel/sched/core.c:6960
>>  schedule_timeout+0x9a/0x270 kernel/time/sleep_timeout.c:75
>>  do_wait_for_common kernel/sched/completion.c:100 [inline]
>>  __wait_for_common kernel/sched/completion.c:121 [inline]
>>  wait_for_common kernel/sched/completion.c:132 [inline]
>>  wait_for_completion+0x2bf/0x5d0 kernel/sched/completion.c:153
>>  io_ring_ctx_lock_nested+0x2b3/0x380 io_uring/io_uring.h:283
>>  io_ring_ctx_lock io_uring/io_uring.h:290 [inline]
>>  io_ring_submit_lock io_uring/io_uring.h:554 [inline]
>>  io_files_update+0x677/0x7f0 io_uring/rsrc.c:504
>>  __io_issue_sqe+0x181/0x4b0 io_uring/io_uring.c:1818
>>  io_issue_sqe+0x1de/0x1190 io_uring/io_uring.c:1841
>>  io_wq_submit_work+0x6e9/0xb90 io_uring/io_uring.c:1953
>>  io_worker_handle_work+0x7cd/0x1180 io_uring/io-wq.c:650
>>  io_wq_worker+0x42f/0xeb0 io_uring/io-wq.c:704
>>  ret_from_fork+0x599/0xb30 arch/x86/kernel/process.c:158
>>  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246
>>  </TASK>
> 
> Interesting, a deadlock between io_wq_exit_workers() on submitter_task
> (which is exiting) and io_ring_ctx_lock() on an io_uring worker
> thread. io_ring_ctx_lock() is blocked until submitter_task runs task
> work, but that will never happen because it's waiting on the
> completion. Not sure what the best approach is here. Maybe have the
> submitter_task alternate between running task work and waiting on the
> completion? Or have some way for submitter_task to indicate that it's
> exiting and disable the IORING_SETUP_SINGLE_ISSUER optimization in
> io_ring_ctx_lock()?
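
To make the cycle concrete, here's a minimal userspace analogue of the
dependency described above (pthreads; every name is illustrative, none
of this is the actual kernel code). The "submitter" blocks in a
completion wait, while the "worker" needs the submitter to run task
work before it can signal that completion, so neither side can make
progress; running this hangs by construction:

#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static bool workers_done;   /* the completion io_wq_exit_workers() waits on */
static bool task_work_ran;  /* what the blocked worker needs first */

/* io-wq worker side: can't take the ring lock until the submitter
 * runs task work, per the io_ring_ctx_lock_nested() trace above */
static void *worker(void *arg)
{
	pthread_mutex_lock(&lock);
	while (!task_work_ran)		/* never becomes true */
		pthread_cond_wait(&cond, &lock);
	workers_done = true;		/* would let the exit finish */
	pthread_cond_broadcast(&cond);
	pthread_mutex_unlock(&lock);
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, worker, NULL);

	/* exiting submitter side: waits for the workers without ever
	 * running task work again, like io_wq_exit_workers() */
	pthread_mutex_lock(&lock);
	while (!workers_done)
		pthread_cond_wait(&cond, &lock);
	pthread_mutex_unlock(&lock);

	pthread_join(t, NULL);
	return 0;
}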

Finally got around to taking a look at this patchset today, and it does
look sound to me. For cases that have zero expected io-wq activity, it
seems like a no-brainer. For cases that have a lot of expected io-wq
activity, which are basically only things like fs/storage workloads on
suboptimal configurations, the suspend/resume mechanism may be
troublesome. But I'm not quite sure what to do about that, or whether
it's even noticeable.

For the case in question, yes, I think we'll need the completion waits
to break out and run task_work.
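
In case it's useful, here's a rough sketch of that shape, again as a
userspace analogue (the exit_wait struct, the task_work_pending flag,
and the helpers are all assumptions for illustration, not the actual
io_uring code): instead of one unconditional completion wait, the
exiting task wakes when task work arrives, services it, and then
re-checks the completion.

#include <pthread.h>
#include <stdbool.h>

struct exit_wait {
	pthread_mutex_t lock;
	pthread_cond_t  cond;
	bool            done;              /* the io-wq exit completion */
	bool            task_work_pending; /* set (plus a cond signal) by
					    * a worker queueing task work */
};

/* Stub: drain whatever task work the workers queued for us, e.g.
 * granting a pending ring lock request */
static void run_task_work(struct exit_wait *w)
{
	/* ... process the queued items ... */
}

/*
 * Instead of blocking unconditionally, wake whenever the completion
 * fires or task work arrives; servicing the task work lets a blocked
 * worker make progress and eventually signal 'done'.
 */
static void wait_for_exit_run_task_work(struct exit_wait *w)
{
	pthread_mutex_lock(&w->lock);
	while (!w->done) {
		if (w->task_work_pending) {
			w->task_work_pending = false;
			pthread_mutex_unlock(&w->lock);
			run_task_work(w);
			pthread_mutex_lock(&w->lock);
			continue;
		}
		pthread_cond_wait(&w->cond, &w->lock);
	}
	pthread_mutex_unlock(&w->lock);
}

The key property is that a worker queueing task work also wakes the
waiter, so the handoff the worker is blocked on can complete and the
exit completion eventually fires.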

-- 
Jens Axboe
