Message-ID: <727110bb-0154-e5df-4b2f-e965e3b98c62@i-love.sakura.ne.jp>
Date: Wed, 18 Jul 2018 23:17:54 +0900
From: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
To: Dmitry Vyukov <dvyukov@...gle.com>
Cc: Eric Van Hensbergen <ericvh@...il.com>,
Ron Minnich <rminnich@...dia.gov>,
Latchesar Ionkov <lucho@...kov.net>,
v9fs-developer@...ts.sourceforge.net,
syzbot <syzbot+f425456ea8aa16b40d20@...kaller.appspotmail.com>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
Al Viro <viro@...iv.linux.org.uk>
Subject: Re: INFO: task hung in grab_super

On 2018/07/18 23:11, Dmitry Vyukov wrote:
> On Wed, Jul 18, 2018 at 3:35 PM, Tetsuo Handa
> <penguin-kernel@...ove.sakura.ne.jp> wrote:
>>>>> This seems to be related to 9p. After rerunning the log I got:
>>>>>
>>>>> root@...kaller:~# ps afxu | grep syz
>>>>> root 18253 0.0 0.0 0 0 ttyS0 Zl 10:16 0:00 \_ [syz-executor] <defunct>
>>>>> root@...kaller:~# cat /proc/18253/task/*/stack
>>>>> [<0>] p9_client_rpc+0x3a2/0x1400
>>>>> [<0>] p9_client_flush+0x134/0x2a0
>>>>> [<0>] p9_client_rpc+0x122c/0x1400
>>>>> [<0>] p9_client_create+0xc56/0x16af
>>>>> [<0>] v9fs_session_init+0x21a/0x1a80
>>>>> [<0>] v9fs_mount+0x7c/0x900
>>>>> [<0>] mount_fs+0xae/0x328
>>>>> [<0>] vfs_kern_mount.part.34+0xdc/0x4e0
>>>>> [<0>] do_mount+0x581/0x30e0
>>>>> [<0>] ksys_mount+0x12d/0x140
>>>>> [<0>] __x64_sys_mount+0xbe/0x150
>>>>> [<0>] do_syscall_64+0x1b9/0x820
>>>>> [<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>>> [<0>] 0xffffffffffffffff
>>>>>
>>>>> There is a bunch of hangs in 9p, so let's do:
>>>>>
>>>>> #syz dup: INFO: task hung in flush_work
>>>>>
>>>> Then, is dumping all threads when khungtaskd fires a candidate
>>>> for CONFIG_DEBUG_AID_FOR_SYZBOT=y path?
>>>
>>> Perhaps would be useful. But maybe only tasks that are blocked for
>>> more than timeout/2? and/or unkillable tasks? killable tasks are not a
>>> problem.
>>
>> TASK_KILLABLE waiters are not reported by khungtaskd, are they?
>>
>>	/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
>>	if (t->state == TASK_UNINTERRUPTIBLE)
>>		check_hung_task(t, timeout);
>>
>> And TASK_KILLABLE waiters can become a problem because
>>
>>>
>>> Btw, I see that p9_client_rpc uses wait_event_killable, why wasn't it
>>> killed along with the whole process?
>>>
>>
>> wait_event_killable() would return -ERESTARTSYS if it got SIGKILL.
>> But if (c->status == Connected) && (type == P9_TFLUSH) is also true,
>> it ignores SIGKILL by retrying the loop...
>>
>> again:
>>	err = wait_event_killable(*req->wq, req->status >= REQ_STATUS_RCVD);
>>	if ((err == -ERESTARTSYS) && (c->status == Connected) && (type == P9_TFLUSH)) {
>>		sigpending = 1;
>>		clear_thread_flag(TIF_SIGPENDING);
>>		goto again;
>>	}
>>
>> I wish they didn't ignore SIGKILL (e.g. by offloading operations to a kernel thread).
>
>
> I guess that's the problem, right? SIGKILL-ed task must not ignore
> SIGKILL and hang in infinite loop. This would explain a bunch of hangs
> in 9p.

Did you check /proc/18253/task/*/stack after manually sending SIGKILL?
I mean, who (i.e. you or the syzkaller programs) is sending a signal (not limited
to SIGKILL, but any signal) that makes TASK_KILLABLE waiters wake up?
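
To make the two points above concrete, here is a rough userspace model, not kernel code. It only mimics the control flow of the quoted khungtaskd check and the quoted p9_client_rpc() retry loop. Every sim_* / SIM_* name is made up for illustration. The numeric state values are my assumption of what include/linux/sched.h has; the only property relied on is that TASK_KILLABLE is TASK_WAKEKILL | TASK_UNINTERRUPTIBLE and therefore never compares "==" to TASK_UNINTERRUPTIBLE. The model also assumes that nobody sends a further signal after the flag has been cleared.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed state bits; only "KILLABLE has an extra bit" matters here. */
#define SIM_TASK_UNINTERRUPTIBLE 0x0002
#define SIM_TASK_WAKEKILL        0x0100
#define SIM_TASK_KILLABLE (SIM_TASK_WAKEKILL | SIM_TASK_UNINTERRUPTIBLE)

#define SIM_ERESTARTSYS 512 /* same value as the kernel's ERESTARTSYS */
#define SIM_WOULD_HANG  1   /* hypothetical: the sleep would never end */

enum sim_outcome { SIM_COMPLETED, SIM_INTERRUPTED, SIM_STUCK };

struct sim_task {
	bool sigkill_pending; /* models TIF_SIGPENDING with SIGKILL queued */
};

struct sim_result {
	enum sim_outcome outcome;
	bool signal_deferred; /* models "sigpending = 1" in the quoted loop */
};

/* Models the "==" test in khungtaskd: killable sleepers are skipped. */
static bool sim_khungtaskd_checks(long state)
{
	return state == SIM_TASK_UNINTERRUPTIBLE;
}

/*
 * Models wait_event_killable() when the reply has not arrived yet:
 * a pending fatal signal ends the wait, otherwise the task sleeps
 * until the server replies (or forever if it never does).
 */
static int sim_wait_killable(const struct sim_task *t, bool server_replies)
{
	if (t->sigkill_pending)
		return -SIM_ERESTARTSYS;
	return server_replies ? 0 : SIM_WOULD_HANG;
}

/* Models the quoted retry loop. */
static struct sim_result sim_rpc_wait(struct sim_task *t, bool connected,
				      bool is_tflush, bool server_replies)
{
	struct sim_result res = { SIM_COMPLETED, false };
	int err;

	assert(t);
again:
	err = sim_wait_killable(t, server_replies);
	if (err == -SIM_ERESTARTSYS && connected && is_tflush) {
		res.signal_deferred = true;
		t->sigkill_pending = false; /* clear_thread_flag(TIF_SIGPENDING) */
		goto again;
	}
	if (err == -SIM_ERESTARTSYS)
		res.outcome = SIM_INTERRUPTED;
	else if (err == SIM_WOULD_HANG)
		res.outcome = SIM_STUCK;
	return res;
}
```

In this model, sim_rpc_wait() with a pending SIGKILL, a connected client, a TFLUSH request and a server that never replies ends up as SIM_STUCK, and sim_khungtaskd_checks(SIM_TASK_KILLABLE) is false, so such a waiter would not be reported. The same call with a non-TFLUSH request returns SIM_INTERRUPTED.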