Date:	Thu, 1 Oct 2015 10:24:18 -0400
From:	Austin S Hemmelgarn <ahferroin7@...il.com>
To:	"sascha a." <sascha.arthur@...il.com>, linux-kernel@...r.kernel.org
Subject: Re: NFS / FuseFS Kernel Hangs Bug

On 2015-10-01 09:06, sascha a. wrote:
> Hello,
>
>
> I want to report a Bug with NFS / FuseFS.
>
> Theres trouble with mounting a NFS FS with FuseFS, if the NFS Server
> is slowly responding.
>
> The problem occurs, if you mount a NFS FS with FuseFS driver for
> example with this command:
>
> mount -t nfs -o vers=3,nfsvers=3,hard,intr,tcp server /dest
>
> Working on this nfs overlay works like a charm, as long as the NFS
> Server is not under heavy load. If it gets under HEAVY load from time
> to time the kernel hangs (which should in my opinion never ever
> occur).
OK, before I explain why this is happening, I should note that unless
you're using some special FUSE driver instead of the regular NFS tools,
you're not using FUSE to mount the NFS share; you're using the regular
in-kernel NFS driver.

Now, on to the explanation:
This behavior is expected and unavoidable for any network filesystem
under the described conditions.  Sync (or any other command that
accesses the filesystem in a way the local cache can't satisfy) has to
send a request to the server.  Sync in particular is _synchronous_
(and it should be, otherwise you break the implied data safety from
using it), so it waits for a reply from the server before it returns.
If the server is heavily loaded (or just ridiculously slow), that will
take a while.  On top of this, depending on how the server is caching
data, it may take a long time to return even on a really fast server
with no other load.
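
To make the blocking visible, here is a minimal sketch that times how
long a single fsync() call takes to return.  The helper name
timed_fsync and the default file name are made up for illustration;
point it at a file on the NFS mount to see how the delay follows the
server's response time.

```python
import os
import sys
import time


def timed_fsync(path, payload=b"x" * 4096):
    """Write payload to path and return how many seconds fsync() blocked."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)
        start = time.monotonic()
        # fsync() does not return until the filesystem reports the data
        # as committed; on a network filesystem that means waiting for
        # the server to answer.
        os.fsync(fd)
        return time.monotonic() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical default file name; pass a path on the NFS mount instead.
    target = sys.argv[1] if len(sys.argv) > 1 else "fsync-probe.tmp"
    print("fsync blocked for %.3f s" % timed_fsync(target))
```

On a local disk this usually returns almost immediately.  On a loaded
NFS server the same call can sit there for as long as the server takes
to reply.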

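The stack trace you posted simply indicates that the kernel noticed
that 'sync' had been in an uninterruptible I/O sleep state (the 'D
state' it refers to) for more than 120 seconds, which is the default
detection timeout for this.  It is a warning about one blocked task,
not a sign that the kernel itself has hung.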
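
If you want to check this yourself, the following rough sketch lists
the tasks currently in D state and reads the detection timeout.  It
assumes a Linux /proc and a kernel built with hung task detection
(otherwise the sysctl file is absent and None is returned); the
function names are made up for illustration.

```python
import glob


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # ')', so split on the first '(' and the last ')'.
    head, _, rest = stat_line.partition("(")
    comm, _, tail = rest.rpartition(")")
    return int(head), comm, tail.split()[0]


def d_state_tasks():
    """Return [(pid, comm), ...] for tasks in uninterruptible sleep."""
    tasks = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited mid-scan or the line was unreadable
        if state == "D":
            tasks.append((pid, comm))
    return tasks


def hung_task_timeout():
    """Return the hung task timeout in seconds, or None if unavailable."""
    try:
        with open("/proc/sys/kernel/hung_task_timeout_secs") as f:
            return int(f.read())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("hung task timeout:", hung_task_timeout())
    for pid, comm in d_state_tasks():
        print("D state: pid %d (%s)" % (pid, comm))
```

A task that shows up here for a moment is normal; the kernel only logs
the warning once a task has stayed in D state longer than the timeout.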

