linux-kernel - Re: INFO: task hung in __io_uring_task

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <e3bd8214-830d-ab3e-57ba-564c5adcac52@kernel.dk>
Date:   Sun, 3 Jan 2021 14:53:35 -0700
From:   Jens Axboe <axboe@...nel.dk>
To:     Palash Oswal <oswalpalash@...il.com>, io-uring@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        mingo@...nel.org, mingo@...hat.com, peterz@...radead.org,
        rostedt@...dmis.org, syzkaller-bugs@...glegroups.com,
        viro@...iv.linux.org.uk, will@...nel.org
Subject: Re: INFO: task hung in __io_uring_task_cancel

On 1/2/21 9:14 PM, Palash Oswal wrote:
>  Hello,
> 
> I was running syzkaller and I found the following issue :
> 
> Head Commit : b1313fe517ca3703119dcc99ef3bbf75ab42bcfb ( v5.10.4 )
> Git Tree : stable
> Console Output :
> [  242.769080] INFO: task repro:2639 blocked for more than 120 seconds.
> [  242.769096]       Not tainted 5.10.4 #8
> [  242.769103] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  242.769112] task:repro           state:D stack:    0 pid: 2639
> ppid:  2638 flags:0x00000004
> [  242.769126] Call Trace:
> [  242.769148]  __schedule+0x28d/0x7e0
> [  242.769162]  ? __percpu_counter_sum+0x75/0x90
> [  242.769175]  schedule+0x4f/0xc0
> [  242.769187]  __io_uring_task_cancel+0xad/0xf0
> [  242.769198]  ? wait_woken+0x80/0x80
> [  242.769210]  bprm_execve+0x67/0x8a0
> [  242.769223]  do_execveat_common+0x1d2/0x220
> [  242.769235]  __x64_sys_execveat+0x5d/0x70
> [  242.769249]  do_syscall_64+0x38/0x90
> [  242.769260]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [  242.769270] RIP: 0033:0x7f59ce45967d
> [  242.769277] RSP: 002b:00007ffd05d10a58 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000142
> [  242.769290] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f59ce45967d
> [  242.769297] RDX: 0000000000000000 RSI: 0000000020000180 RDI: 00000000ffffffff
> [  242.769304] RBP: 00007ffd05d10a70 R08: 0000000000000000 R09: 00007ffd05d10a70
> [  242.769311] R10: 0000000000000000 R11: 0000000000000246 R12: 000055a91d37d320
> [  242.769318] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

Can you see if this helps? The reproducer is pretty brutal, it'll fork
thousands of tasks with rings! But should work of course. I think this
one is pretty straight forward, and actually an older issue with the
poll rewaiting.

diff --git a/fs/io_uring.c b/fs/io_uring.c
index ca46f314640b..539de04f9183 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5103,6 +5103,12 @@ static bool io_poll_rewait(struct io_kiocb *req, struct io_poll_iocb *poll)
 {
 	struct io_ring_ctx *ctx = req->ctx;
 
+	/* Never re-wait on poll if the ctx or task is going away */
+	if (percpu_ref_is_dying(&ctx->refs) || req->task->flags & PF_EXITING) {
+		spin_lock_irq(&ctx->completion_lock);
+		return false;
+	}
+
 	if (!req->result && !READ_ONCE(poll->canceled)) {
 		struct poll_table_struct pt = { ._key = poll->events };
 

-- 
Jens Axboe