lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1219400624.18774.67.camel@zakaz.uk.xensource.com>
Date:	Fri, 22 Aug 2008 11:23:43 +0100
From:	Ian Campbell <ijc@...lion.org.uk>
To:	Trond Myklebust <trond.myklebust@....uio.no>
Cc:	Grant Coady <gcoady.lk@...il.com>, linux-kernel@...r.kernel.org,
	neilb@...e.de, bfields@...ldses.org, linux-nfs@...r.kernel.org
Subject: Re: NFS regression?  Odd delays and lockups accessing an NFS
	export.

On Mon, 2008-08-18 at 15:20 -0400, Trond Myklebust wrote:
> Please try to reproduce the hang, then do
> 
> 	echo 0 >/proc/sys/sunrpc/rpc_debug
> 
> and send the output from 'dmesg'...

I've also been seeing some NFS related lockups, although I'm not sure if
they are the same as the one in this thread or not. Client is 2.6.26
(Debian's kernel) and Server is 2.6.25 (also a Debian kernel, but from
backports.org).

rpc_debug on the server gives nothing, on the client gives:
[144741.637997] -pid- proc flgs status -client- -prog- --rqstp- -timeout -rpcwait -action- ---ops--
[144741.637997]  3439 0004 0080    -11 f3f48200 100003 f7770000        0 xprt_sending fa0ae88e fa0bddf4
[144741.637997]  3438 0001 00a0      0 f77f2a00 100003 f77700d0    15000 xprt_pending fa0ae88e fa0bddf4

There are no processes running with pid 3439 3438 (I don't think it's
that sort of pid though).

mounts points are:
hopkins:/storage/music /storage/music nfs rw,nosuid,vers=3,rsize=32768,wsize=32768,namlen=255,hard,intr,proto=tcp,timeo=600,retrans=2,sec=sys,mountproto=udp,addr=192.168.1.6 0 0
hopkins:/storage/mythtv/recordings /var/lib/mythtv/recordings nfs rw,nosuid,vers=3,rsize=32768,wsize=32768,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,intr,proto=tcp,timeo=600,retrans=2,sec=sys,mountproto=udp,addr=192.168.1.6 0 0
hopkins:/var/lib/mythvideo /var/lib/mythvideo nfs rw,nosuid,vers=3,rsize=32768,wsize=32768,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,intr,proto=tcp,timeo=600,retrans=2,sec=sys,mountproto=udp,addr=192.168.1.6 0 0
hopkins:/storage/home/ijc /home/ijc nfs rw,vers=3,rsize=131072,wsize=131072,namlen=255,hard,nointr,proto=tcp,timeo=600,retrans=2,sec=sys,mountproto=udp,addr=192.168.1.6 0 0
and all seem to be effected.

It hasn't happened this time (yet) but usually I get a hung task
backtrace like this:
Aug  4 06:27:28 iranon kernel: [137969.382277] INFO: task mythbackend:3161 blocked for more than 120 seconds.
Aug  4 06:27:28 iranon kernel: [137969.382287] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug  4 06:27:28 iranon kernel: [137969.382291] mythbackend   D 00005dfc     0  3161      1
Aug  4 06:27:28 iranon kernel: [137969.382295]        f2006c70 00000082 4b05cc7e 00005dfc f2006df0 c1c0e920 00000000 f7273390 
Aug  4 06:27:28 iranon kernel: [137969.382301]        0000c8b4 00000000 00000001 f7273398 00000282 00000202 f71b0ab0 f200ddf0 
Aug  4 06:27:28 iranon kernel: [137969.382306]        c1c012bc fa18f0a6 fa18f0bf c02b45a7 f71b0ab0 00000000 f200de0c fa18f0a6 
Aug  4 06:27:28 iranon kernel: [137969.382311] Call Trace:
Aug  4 06:27:28 iranon kernel: [137969.382347]  [<fa18f0a6>] nfs_wait_schedule+0x0/0x1e [nfs]
Aug  4 06:27:28 iranon kernel: [137969.382384]  [<fa18f0bf>] nfs_wait_schedule+0x19/0x1e [nfs]
Aug  4 06:27:28 iranon kernel: [137969.382399]  [<c02b45a7>] __wait_on_bit_lock+0x2a/0x52
Aug  4 06:27:28 iranon kernel: [137969.382407]  [<fa18f0a6>] nfs_wait_schedule+0x0/0x1e [nfs]
Aug  4 06:27:28 iranon kernel: [137969.382421]  [<c02b462e>] out_of_line_wait_on_bit_lock+0x5f/0x67
Aug  4 06:27:28 iranon kernel: [137969.382429]  [<c0133a1d>] wake_bit_function+0x0/0x3c
Aug  4 06:27:28 iranon kernel: [137969.382441]  [<fa18f16e>] __nfs_revalidate_inode+0xaa/0x211 [nfs]
Aug  4 06:27:28 iranon kernel: [137969.382458]  [<c017bac0>] do_lookup+0x53/0x145
Aug  4 06:27:28 iranon kernel: [137969.382466]  [<c0187609>] mntput_no_expire+0x11/0x64
Aug  4 06:27:28 iranon kernel: [137969.382472]  [<c017d971>] __link_path_walk+0xa71/0xb65
Aug  4 06:27:28 iranon kernel: [137969.382477]  [<c017bac0>] do_lookup+0x53/0x145
Aug  4 06:27:28 iranon kernel: [137969.382483]  [<c0187609>] mntput_no_expire+0x11/0x64
Aug  4 06:27:28 iranon kernel: [137969.382492]  [<c0187609>] mntput_no_expire+0x11/0x64
Aug  4 06:27:28 iranon kernel: [137969.382496]  [<c017daf5>] path_walk+0x90/0x98
Aug  4 06:27:28 iranon kernel: [137969.382502]  [<fa18f935>] nfs_getattr+0x8f/0xbe [nfs]
Aug  4 06:27:28 iranon kernel: [137969.382520]  [<fa18f8a6>] nfs_getattr+0x0/0xbe [nfs]
Aug  4 06:27:28 iranon kernel: [137969.382536]  [<c01785ac>] vfs_getattr+0x36/0x4d
Aug  4 06:27:28 iranon kernel: [137969.382545]  [<c0178684>] vfs_lstat_fd+0x27/0x39
Aug  4 06:27:28 iranon kernel: [137969.382550]  [<fa18bca6>] nfs_permission+0x0/0x129 [nfs]
Aug  4 06:27:28 iranon kernel: [137969.382567]  [<c0187609>] mntput_no_expire+0x11/0x64
Aug  4 06:27:28 iranon kernel: [137969.382572]  [<c0174914>] sys_faccessat+0x11e/0x15d
Aug  4 06:27:28 iranon kernel: [137969.382582]  [<c0178703>] sys_lstat64+0xf/0x23
Aug  4 06:27:28 iranon kernel: [137969.382588]  [<c017603c>] vfs_read+0xe3/0x11e
Aug  4 06:27:28 iranon kernel: [137969.382598]  [<c0174962>] sys_access+0xf/0x13
Aug  4 06:27:28 iranon kernel: [137969.382603]  [<c01077e8>] sysenter_past_esp+0x6d/0xa5
Aug  4 06:27:28 iranon kernel: [137969.382617]  =======================

Sysrq-T from dmesg (therefore truncated a bit) is attached.

Ian.

-- 
Ian Campbell

There are no answers, only cross-references.
		-- Weiner

View attachment "sysrq" of type "text/plain" (124791 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ