Message-id: <172524227511.4433.7227683124049217481@noble.neil.brown.name>
Date: Mon, 02 Sep 2024 11:57:55 +1000
From: "NeilBrown" <neilb@...e.de>
To: "syzbot" <syzbot+d1e76d963f757db40f91@...kaller.appspotmail.com>
Cc: Dai.Ngo@...cle.com, chuck.lever@...cle.com, dai.ngo@...cle.com,
 jlayton@...nel.org, kolga@...app.com, linux-kernel@...r.kernel.org,
 linux-nfs@...r.kernel.org, lorenzo@...nel.org, netdev@...r.kernel.org,
 okorniev@...hat.com, syzkaller-bugs@...glegroups.com, tom@...pey.com
Subject: Re: [syzbot] [nfs?] INFO: task hung in nfsd_nl_listener_set_doit

On Sun, 01 Sep 2024, syzbot wrote:
> syzbot has found a reproducer for the following issue on:

I had a poke around using the provided disk image and kernel.

I think the problem is demonstrated by this stack:

[<0>] rpc_wait_bit_killable+0x1b/0x160
[<0>] __rpc_execute+0x723/0x1460
[<0>] rpc_execute+0x1ec/0x3f0
[<0>] rpc_run_task+0x562/0x6c0
[<0>] rpc_call_sync+0x197/0x2e0
[<0>] rpcb_register+0x36b/0x670
[<0>] svc_unregister+0x208/0x730
[<0>] svc_bind+0x1bb/0x1e0
[<0>] nfsd_create_serv+0x3f0/0x760
[<0>] nfsd_nl_listener_set_doit+0x135/0x1a90
[<0>] genl_rcv_msg+0xb16/0xec0
[<0>] netlink_rcv_skb+0x1e5/0x430

No rpcbind is running on this host, so that "svc_unregister" takes a
long time.  Maybe not forever, but if a few of these get queued up, all
blocking some other thread, then maybe that pushes it over the
hung-task limit.

The fact that rpcbind is not running might not be relevant, as the test
messes up the network: "ping 127.0.0.1" stops working.

So this bug comes down to "we try to contact rpcbind while holding a
mutex and if that gets no response and no error, then we can hold the
mutex for a long time".
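
The pattern can be sketched in userspace.  This is only an illustration
of "slow call made while holding a lock", not the nfsd code: the lock,
the function names and the delays are all made up, and the rpcbind call
is simulated with a sleep rather than a real RPC.

```python
import threading
import time

# Hypothetical stand-in for the mutex held by the netlink handler.
service_mutex = threading.Lock()


def contact_rpcbind(delay):
    """Simulated rpcbind call: no reply, no error, it just takes `delay` seconds."""
    time.sleep(delay)


def listener_set(delay, started):
    """Models the handler: take the mutex, then block on rpcbind while holding it."""
    with service_mutex:
        started.set()  # tell the caller the mutex is now held
        contact_rpcbind(delay)


def try_listener_set(timeout):
    """A second caller: True if it obtained the mutex within `timeout` seconds."""
    got = service_mutex.acquire(timeout=timeout)
    if got:
        service_mutex.release()
    return got


def demo(hold=0.5, wait=0.1):
    """Return (second caller was blocked, mutex was free afterwards)."""
    started = threading.Event()
    holder = threading.Thread(target=listener_set, args=(hold, started))
    holder.start()
    started.wait()  # make sure the holder really owns the mutex
    blocked = not try_listener_set(wait)
    holder.join()
    free_after = try_listener_set(wait)
    return blocked, free_after
```

With the defaults, demo() reports that the second caller could not get
the mutex while the first was waiting on the simulated rpcbind, and
could get it once that wait ended.  The longer the wait, the longer
every other caller is stuck behind it.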

Are we surprised? Do we want to fix this?  Any suggestions how?

NeilBrown
