Message-ID: <85f96aab-4127-f494-9718-d7bfc035db54@gmail.com>
Date:   Fri, 22 Oct 2021 14:49:36 +0100
From:   Pavel Begunkov <asml.silence@...il.com>
To:     syzbot <syzbot+27d62ee6f256b186883e@...kaller.appspotmail.com>,
        axboe@...nel.dk, io-uring@...r.kernel.org,
        linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] INFO: task hung in io_wqe_worker

On 10/22/21 05:38, syzbot wrote:
> Hello,
> 
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> INFO: task hung in io_wqe_worker
> 
> INFO: task iou-wrk-9392:9401 blocked for more than 143 seconds.
>        Not tainted 5.15.0-rc2-syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:iou-wrk-9392    state:D stack:27952 pid: 9401 ppid:  7038 flags:0x00004004
> Call Trace:
>   context_switch kernel/sched/core.c:4940 [inline]
>   __schedule+0xb44/0x5960 kernel/sched/core.c:6287
>   schedule+0xd3/0x270 kernel/sched/core.c:6366
>   schedule_timeout+0x1db/0x2a0 kernel/time/timer.c:1857
>   do_wait_for_common kernel/sched/completion.c:85 [inline]
>   __wait_for_common kernel/sched/completion.c:106 [inline]
>   wait_for_common kernel/sched/completion.c:117 [inline]
>   wait_for_completion+0x176/0x280 kernel/sched/completion.c:138
>   io_worker_exit fs/io-wq.c:183 [inline]
>   io_wqe_worker+0x66d/0xc40 fs/io-wq.c:597
>   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

Easily reproducible. The worker is stuck in:

static void io_worker_exit(struct io_worker *worker)
{
	...
	wait_for_completion(&worker->ref_done);
	...
}

The reference belongs to a create_worker_cb() task_work item. That item is
expected to be either executed or cancelled by io_wq_exit_workers(), but the
owner task never goes through __io_uring_cancel() (called from do_exit()) and
so never reaches io_wq_exit_workers().
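
To make the reference accounting above concrete, here is a minimal user-space
sketch. It is not the kernel code: worker_model, completion_model, worker_put()
and exit_would_block() are hypothetical names, and the model only assumes that
the last dropped reference is what signals ref_done.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins, not the kernel structures. */
struct completion_model {
	bool done;
};

struct worker_model {
	int refs;                         /* models worker->ref */
	struct completion_model ref_done; /* models worker->ref_done */
};

/* Drop one reference; the last put signals ref_done. */
static void worker_put(struct worker_model *w)
{
	assert(w->refs > 0);
	if (--w->refs == 0)
		w->ref_done.done = true;
}

/*
 * Returns true if the worker would still be waiting on ref_done when it
 * reaches its exit path. task_work_handled says whether the queued
 * task_work item was executed or cancelled before that point.
 */
static bool exit_would_block(bool task_work_handled)
{
	struct worker_model w = { .refs = 1, .ref_done = { .done = false } };

	w.refs++;                 /* queued task_work item holds a reference */
	if (task_work_handled)
		worker_put(&w);   /* item ran or was cancelled: reference dropped */
	worker_put(&w);           /* worker drops its own reference on exit */

	return !w.ref_done.done;  /* the wait on ref_done would sleep here */
}
```

In this model exit_would_block(false) is true: with the task_work item neither
run nor cancelled, its reference is never dropped and the wait never finishes,
which matches the hang in the report.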

Looking at the owner task, cat /proc/<pid>/stack shows:

[<0>] do_coredump+0x1d0/0x10e0
[<0>] get_signal+0x4a3/0x960
[<0>] arch_do_signal_or_restart+0xc3/0x6d0
[<0>] exit_to_user_mode_prepare+0x10e/0x190
[<0>] irqentry_exit_to_user_mode+0x9/0x20
[<0>] irqentry_exit+0x36/0x40
[<0>] exc_page_fault+0x95/0x190
[<0>] asm_exc_page_fault+0x1e/0x30

(gdb) l *(do_coredump+0x1d0-5)
0xffffffff81343ccb is in do_coredump (fs/coredump.c:469).
464
465             if (core_waiters > 0) {
466                     struct core_thread *ptr;
467
468                     freezer_do_not_count();
469                     wait_for_completion(&core_state->startup);
470                     freezer_count();

I can't say anything more at the moment, as I'm not familiar with the
coredump code.

-- 
Pavel Begunkov
