Message-ID: <20120815061115.GC32347@cucamonga.audible.transient.net>
Date:	Wed, 15 Aug 2012 06:11:15 +0000
From:	Jamie Heilman <jamie@...ible.transient.net>
To:	"J. Bruce Fields" <bfields@...ldses.org>
Cc:	linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to return

I really wish I could have nailed this down better, but I've had a
hard time reliably reproducing the problem during bisection, and I
haven't seen anyone report a similar-sounding problem.  Here's what
I've seen: since 3.5 I've been having spurious delays on my NFS
clients, particularly noticeable when I open an mbox file in mutt over
an NFSv4 mount from a v3.5 or later server.  The servers I've
reproduced this on are all uniprocessor 32-bit systems, but I
haven't tried SMP or 64-bit systems yet, so it may or may not exist
there.  When the delay occurs, it's quite noticeable: I've never seen
one that takes less than 40 seconds to "unstick."  I wrote a quick and
dirty reproduction tool, based on the syscalls mutt was making that
triggered the problem, and attached it to this message.  To use it,
compile the file as utime-test on an exported volume, then run
(cd /some/mount/point && strace -T ./utime-test) from an NFSv4 client.

For whatever reason, I frequently find that the second call to utime
takes an irritatingly long time to return, and I see something like:
utime("utime-test.c", [2012/08/14-22:47:21, 2012/08/14-17:25:21]) = 0 <70.510913>
in the strace output.
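
For readers without the attachment, here is a minimal sketch of the kind
of probe described above.  It is not the attached utime-test.c; the
function names timed_utime and probe_utime are hypothetical, and it
assumes that calling utime() twice in a row with the file's existing
timestamps is enough to show the per-call latency that strace -T
reports.

```c
/* Hypothetical sketch, NOT the attached utime-test.c. */
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <utime.h>

/* Call utime() on path and store the elapsed wall-clock seconds in
 * *elapsed.  Returns utime()'s own result: 0 on success, -1 on error. */
int timed_utime(const char *path, const struct utimbuf *times,
                double *elapsed)
{
    struct timespec start, end;
    int rc;

    clock_gettime(CLOCK_MONOTONIC, &start);
    rc = utime(path, times);
    clock_gettime(CLOCK_MONOTONIC, &end);

    *elapsed = (double)(end.tv_sec - start.tv_sec)
             + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    return rc;
}

/* Re-apply the file's current atime/mtime twice (an assumed call
 * pattern) and print how long each utime() call took.
 * Returns 0 on success, -1 if stat() or utime() fails. */
int probe_utime(const char *path)
{
    struct stat st;
    struct utimbuf times;
    double elapsed;
    int i;

    if (stat(path, &st) != 0)
        return -1;
    times.actime = st.st_atime;
    times.modtime = st.st_mtime;

    for (i = 1; i <= 2; i++) {
        if (timed_utime(path, &times, &elapsed) != 0)
            return -1;
        printf("utime call %d: %.6f s\n", i, elapsed);
    }
    return 0;
}
```

Calling probe_utime("utime-test.c") from a small main(), run inside the
NFSv4 mount, prints one timing line per call, so a stuck second call
would stand out without needing strace.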

I've reproduced this on Debian Squeeze / nfs-utils 1.2.2 based servers
(legacy idmapper, no user-space nfsidmap), as well as Debian Wheezy /
nfs-utils 1.2.6 (uses keyutils upcalls) servers, so I doubt it's a
user-space related issue.  Attempts to bisect have been muddled, and
I'll keep trying in the interim, but the best I've been able to pin
things down is that the issue was probably introduced in the
419f4319495043a9507ac3e616be9ca60af09744 merge.  I can't reproduce it
on a kernel based on fb21affa49204acd409328415b49bfe90136653c.  (I say
based on, because I have to apply the patch from
http://marc.info/?l=linux-nfs&m=133950479803025 or face additional
problems.)

I'll try to get full rpcdebug traces on client and server while the
delay is occurring, in the hope that helps pin things down, and post
them separately.

-- 
Jamie Heilman                     http://audible.transient.net/~jamie/

View attachment "utime-test.c" of type "text/x-csrc" (697 bytes)
