Message-id: <172039726840.11489.12386749198888516742@noble.neil.brown.name>
Date: Mon, 08 Jul 2024 10:07:48 +1000
From: "NeilBrown" <neilb@...e.de>
To: "syzbot" <syzbot+b568ba42c85a332a88ee@...kaller.appspotmail.com>
Cc: "Jeff Layton" <jlayton@...nel.org>, Dai.Ngo@...cle.com,
chuck.lever@...cle.com, kolga@...app.com, linux-kernel@...r.kernel.org,
linux-nfs@...r.kernel.org, syzkaller-bugs@...glegroups.com, tom@...pey.com
Subject: Re: [syzbot] [nfs?] INFO: task hung in nfsd_umount
On Sun, 07 Jul 2024, Jeff Layton wrote:
> On Sat, 2024-07-06 at 21:37 -0700, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 1dd28064d416 Merge tag 'integrity-v6.10-fix' of ssh://ra.k..
...
> > 2 locks held by syz.4.2691/12623:
> > #0: ffffffff8f63ad70 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 net/netlink/genetlink.c:1218
> > #1: ffffffff8e5fff48 (nfsd_mutex){+.+.}-{3:3}, at: nfsd_nl_listener_set_doit+0x12d/0x1a90 fs/nfsd/nfsctl.c:1940
> > 2 locks held by syz.0.2871/13167:
> > #0: ffff88804e1920e0 (&type->s_umount_key#77){++++}-{3:3}, at: __super_lock fs/super.c:56 [inline]
> > #0: ffff88804e1920e0 (&type->s_umount_key#77){++++}-{3:3}, at: __super_lock_excl fs/super.c:71 [inline]
> > #0: ffff88804e1920e0 (&type->s_umount_key#77){++++}-{3:3}, at: deactivate_super+0xb5/0xf0 fs/super.c:505
> > #1: ffffffff8e5fff48 (nfsd_mutex){+.+.}-{3:3}, at: nfsd_shutdown_threads+0x4e/0xd0 fs/nfsd/nfssvc.c:632
>
> Above there are two tasks that are holding/blocked on the nfsd_mutex.
> The first is the stuck thread. The second is a task that took it in
> nfsd_nl_listener_set_doit. I don't see a way to exit that function
> without releasing that mutex, so it seems likely that thread is stuck
> too for some reason. Unfortunately, we don't have a stack trace from
> that task, so I can't tell what it's doing.
>
We can guess though. It isn't waiting for a lock - that would show in
the above list - so it might be waiting for a wakeup, or might be
spinning.

The only wake-up I can imagine is in one of the memory-allocation calls,
but if the system were running out of memory we would probably see
messages about that.

I wonder if it could be looping in svc_xprt_destroy_all(), and sitting
in the msleep() when the hang is detected so there are no locks to
report. I can't see why it would block there.

It would really help to get a full task list.
There is a sysctl for that: /proc/sys/kernel/hung_task_all_cpu_backtrace
Could that be enabled?
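
For anyone reproducing this, a minimal sketch of turning that sysctl on at
runtime. It assumes root privileges and a kernel built with hung-task
detection so that the file exists; the function name and its optional path
argument are hypothetical, added only so the write can be tried against a
scratch file first.

```shell
# Sketch: enable all-CPU backtraces in hung-task reports.
# The optional argument is a hypothetical convenience for testing;
# with no argument the function writes the real sysctl file.
enable_hung_task_backtrace() {
    target="${1:-/proc/sys/kernel/hung_task_all_cpu_backtrace}"
    # Writing 1 enables the behaviour; writing 0 disables it again.
    echo 1 > "$target"
}
```

Calling enable_hung_task_backtrace with no argument should be equivalent
to "sysctl -w kernel.hung_task_all_cpu_backtrace=1". The setting does not
persist across reboots.
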
NeilBrown