Message-ID: <CAK_92-HRCUAg6=ATiGT0Cbh+Qkhk9mHBUXje9z9utosEC7e0cw@mail.gmail.com>
Date:	Thu, 1 Oct 2015 16:45:45 +0200
From:	"sascha a." <sascha.arthur@...il.com>
To:	Austin S Hemmelgarn <ahferroin7@...il.com>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: NFS / FuseFS Kernel Hangs Bug

Hello,

Okay, I was wrong about FUSE and NFS; thanks for the hint.

About the problem:
Without digging deep into the kernel sources, your explanation is
more or less what I thought was happening.
Anyway, the reason I am reporting the problem is that during these 120
seconds (until the kernel resolves the issue by killing (?) the
process) the system is unusable.

What I mean by that:
It is not even possible to ssh into the server, even though /root and
/home are local and should not be affected by the slow NFS servers.
It also seems that a lot of network connections drop/freeze (?) during
this period.

You're completely right when you say there's no other way and that it's
by design to wait for the NFS response. But from my point of view this
'wait' is happening on the wrong security level. If I'm not wrong, the
current implementation blocks/hangs tasks in kernel space, or at least
blocks the scheduler during this period.
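
One way to check which processes are actually blocked during such a
stall is to look for state 'D' (uninterruptible sleep) in
/proc/<pid>/stat. The following is only a minimal sketch, assuming a
Linux /proc layout; the function names parse_state and
uninterruptible_processes are made up for illustration and are not part
of any existing tool.

```python
import os


def parse_state(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' instead of splitting on whitespace.
    left, right = stat_line.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    state = right.split()[0]
    return int(pid), comm, state


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process whose state is 'D'."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_state(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

If only processes touching the NFS mount (for example 'sync') show up
here while sshd does not, that would suggest the ssh problem has a
different cause than the blocked NFS I/O itself.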

2015-10-01 16:24 GMT+02:00 Austin S Hemmelgarn <ahferroin7@...il.com>:
> On 2015-10-01 09:06, sascha a. wrote:
>>
>> Hello,
>>
>>
>> I want to report a Bug with NFS / FuseFS.
>>
>> There's trouble mounting an NFS FS with FuseFS if the NFS server
>> is slow to respond.
>>
>> The problem occurs if you mount an NFS FS with the FuseFS driver, for
>> example with this command:
>>
>> mount -t nfs -o vers=3,nfsvers=3,hard,intr,tcp server /dest
>>
>> Working on this NFS overlay works like a charm as long as the NFS
>> server is not under heavy load. If it gets under HEAVY load, from time
>> to time the kernel hangs (which in my opinion should never ever
>> occur).
>
> OK, before I start on an explanation of why what is happening is happening,
> I should note that unless you're using some special FUSE driver instead of
> the regular NFS tools, you're not using FUSE to mount the NFS share, you're
> using a regular kernel driver.
>
> Now, on to the explanation:
> This behavior is expected and unavoidable for any network filesystem under
> the described conditions.  Sync (or any other command that causes access to
> the filesystem that isn't served by the local cache) requires sending a
> command to the server.  Sync in particular is _synchronous_ (and it should
> be, otherwise you break the implied data safety from using it), which means
> that it will wait until it gets a reply from the server before it returns,
> which means that if the server is heavily loaded (or just ridiculously
> slow), it will be a while before it returns.  On top of this, depending on
> how the server is caching data, it may take a long time to return even on a
> really fast server with no other load.
>
> The stacktrace you posted indicates simply that the kernel noticed that
> 'sync' was in an I/O sleep state (the 'D state' it refers to) for more than
> 120 seconds, which is the default detection timeout for this.
>
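The 120-second threshold mentioned above is exposed as the sysctl
kernel.hung_task_timeout_secs on kernels built with hung task detection.
A minimal sketch for reading it follows; the helper names are
hypothetical, and a value of 0 means the check is disabled. Note that
this detector only reports the blocked task; by default it does not
kill it.

```python
def read_hung_task_timeout(path="/proc/sys/kernel/hung_task_timeout_secs"):
    """Return the hung-task warning threshold in seconds, or None."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # File missing (detection not built in) or unexpected content.
        return None


def describe_timeout(value):
    """Turn the raw sysctl value into a short human-readable summary."""
    if value is None:
        return "hung task detection not available"
    if value == 0:
        return "hung task warnings disabled"
    return "warns after %d seconds in D state" % value


if __name__ == "__main__":
    print(describe_timeout(read_hung_task_timeout()))
```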