Date:   Mon, 12 Nov 2018 04:38:34 +0000 (UTC)
From:   Kevin Liu <kevin@...atofrom.space>
To:     linux-ext4@...r.kernel.org
Cc:     kevin@...atofrom.space, bfields@...ldses.org
Subject: ext4-nfsd interaction causes sporadic hang on rwsem_down_write_failed

Hi,

I recently filed an NFS bug
(https://bugzilla.kernel.org/show_bug.cgi?id=201655) in which nfsd
sporadically hangs in rwsem_down_write_failed:

Nov 10 15:29:55 rem kernel: INFO: task nfsd:7464 blocked for more than 120 seconds.
Nov 10 15:29:55 rem kernel:       Tainted: P           O      4.19.1 #1-NixOS
Nov 10 15:29:55 rem kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 10 15:29:55 rem kernel: nfsd            D    0  7464      2 0x80000000
Nov 10 15:29:55 rem kernel: Call Trace:
Nov 10 15:29:55 rem kernel:  ? __schedule+0x1d3/0x6f0
Nov 10 15:29:55 rem kernel:  schedule+0x28/0x80
Nov 10 15:29:55 rem kernel:  rwsem_down_write_failed+0x15e/0x350
Nov 10 15:29:55 rem kernel:  ? call_rwsem_down_write_failed+0x13/0x20
Nov 10 15:29:55 rem kernel:  call_rwsem_down_write_failed+0x13/0x20
Nov 10 15:29:55 rem kernel:  down_write+0x29/0x40
Nov 10 15:29:55 rem kernel:  ext4_file_write_iter+0x91/0x3d0 [ext4]
Nov 10 15:29:55 rem kernel:  ? nfsd_proc_write+0x160/0x160 [nfsd]
Nov 10 15:29:55 rem kernel:  ? exportfs_decode_fh+0xf2/0x2b0
...

(more details in the bugzilla)

And according to Bruce Fields:

> I'm guessing it's the inode_lock at the start of ext4_file_write_iter that's blocking.  On a quick look I don't see any of the callers taking any locks.  So I'd expect that elsewhere there'd be a process holding that inode lock and blocking on something else.
> 
> Based just on this my first guess would be a vfs or maybe ext4 bug rather than an nfsd bug, but I'm not seeing how to reassign.  May be worth reporting to the relevant mailing lists.

So, starting with ext4, I was wondering if you had an idea of what the
cause might be or where the fault truly lies.
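
Following Bruce's reasoning, one way to look for the task that holds the
inode lock is to go through every hung-task report in the kernel log and
single out those that are not themselves waiting in
rwsem_down_write_failed. Below is a rough sketch of that idea. The helper
names (parse_hung_tasks, possible_lock_holders) are hypothetical, and the
heuristic rests on the assumption that the holder also shows up in a
hung-task report, which may not be the case.

```python
import re

# Hung-task header, e.g. "INFO: task nfsd:7464 blocked for more than 120 seconds."
HEADER = re.compile(r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than")

# Call-trace frame, e.g. "down_write+0x29/0x40" or "? __schedule+0x1d3/0x6f0".
# A leading "? " marks a frame the unwinder considers unreliable.
FRAME = re.compile(r"(?P<guess>\? )?(?P<func>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def parse_hung_tasks(log_text):
    """Return one dict per hung-task report: {"comm", "pid", "frames"}."""
    tasks = []
    current = None
    for line in log_text.splitlines():
        header = HEADER.search(line)
        if header:
            current = {
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "frames": [],
            }
            tasks.append(current)
            continue
        frame = FRAME.search(line)
        # Keep only reliable frames that belong to an open report.
        if frame and current is not None and not frame["guess"]:
            current["frames"].append(frame["func"])
    return tasks


def possible_lock_holders(tasks, wait_func="rwsem_down_write_failed"):
    """Assumption: a task hung somewhere other than wait_func may hold the lock."""
    return [task for task in tasks if wait_func not in task["frames"]]
```

Fed the trace above, parse_hung_tasks reports a single task (nfsd, pid
7464) and possible_lock_holders returns nothing, since that task is itself
a waiter. It would have to be run over the full log to be of any use.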

Kevin Liu
