lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 18 Aug 2008 19:50:48 +0100
From:	Athanasius <link@...gy.org>
To:	Grant Coady <gcoady.lk@...il.com>,
	linux-kernel <linux-kernel@...r.kernel.org>
Cc:	neilb@...e.de, bfields@...ldses.org, linux-nfs@...r.kernel.org
Subject: Re: NFS regression?  Odd delays and lockups accessing an NFS export.

On Mon, Aug 18, 2008 at 12:02:20PM +1000, Grant Coady wrote:
> I've been using NFS here for years, lately there's something odd going on 
> since about a month or so.  Previously reported last month:
> http://www.gossamer-threads.com/lists/linux/kernel/951419?page=last
> 
> Now with 2.6.27-rc3 on one of the client boxes I get a complete stall 
> at odd times when accessing the server's exported directory, cannot 
> see a pattern to it.  Eventually recovers after a Ctrl-C.  Nothing in 
> the server or client log files.  Not easy to reproduce either.

  I wonder if this is what I've been seeing.  I've been otherwise too
busy to properly report it, thinking that *someone* else must also be
seeing it and it's being worked on, else it's a subtle configuration
problem my end.

  I first started seeing this with 2.6.26 on the client end, 2.6.25.10 on
the server end.  Things last worked fine with that same server version
and 2.6.25.10 on the client end.  The affected export is:

/home/users     192.168.1.161(rw,no_root_squash,sync,no_subtree_check)

using the in-kernel NFS server, not the userspace one.

With the mount on the client being:

192.168.1.162:/home/users       /home/users nfs defaults,rw,nfsvers=3,rsize=8192,wsize=8192,nosuid,nodev,soft,intr 0 0

It used to have a 'nolock' in it, but I took that out and saw no
difference.  The one possible solid clue as to what's happening is this,
after I turned on all the lock etc checking in a 2.6.26.2 kernel:

Aug 12 16:10:28 emelia kernel: [  361.851316] INFO: task firefox-bin:5716 blocked for more than 120 seconds.
Aug 12 16:10:28 emelia kernel: [  361.851326] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 12 16:10:28 emelia kernel: [  361.851332] firefox-bin   D f7e0a018  5556  5716   5696
Aug 12 16:10:28 emelia kernel: [  361.851348]        e24f3ddc 00000046 c2cc2e40 f7e0a018 c0640030 c0642e40 c0642e40 c0642e40 
Aug 12 16:10:28 emelia kernel: [  361.851382]        e2ec1018 e2ec1270 c2cc2e40 00000001 c2cc2e40 e24f3db8 00000046 c2c0b680 
Aug 12 16:10:28 emelia kernel: [  361.851406]        e2e40da0 c2c0b680 e2ec1270 fffef1e4 00000001 00000046 c2c0b670 c2c0b670 
Aug 12 16:10:28 emelia kernel: [  361.851435] Call Trace:
Aug 12 16:10:28 emelia kernel: [  361.851449]  [<c01e9d71>] nfs_wait_bit_killable+0x2a/0x2e
Aug 12 16:10:28 emelia kernel: [  361.851462]  [<c041ee67>] __wait_on_bit+0x36/0x5d
Aug 12 16:10:28 emelia kernel: [  361.851475]  [<c01e9d47>] ? nfs_wait_bit_killable+0x0/0x2e
Aug 12 16:10:28 emelia kernel: [  361.851586]  [<c041ef3a>] out_of_line_wait_on_bit+0xac/0xb4
Aug 12 16:10:28 emelia kernel: [  361.851599]  [<c01e9d47>] ? nfs_wait_bit_killable+0x0/0x2e
Aug 12 16:10:28 emelia kernel: [  361.851613]  [<c0135286>] ? wake_bit_function+0x0/0x43
Aug 12 16:10:28 emelia kernel: [  361.851632]  [<c01e9d40>] nfs_wait_on_request+0x1f/0x26
Aug 12 16:10:28 emelia kernel: [  361.851642]  [<c01ed9cd>] nfs_sync_mapping_wait+0xde/0x27b
Aug 12 16:10:28 emelia kernel: [  361.851653]  [<c01e9c3d>] ? nfs_pageio_complete+0x8/0xa
Aug 12 16:10:28 emelia kernel: [  361.851673]  [<c01ee372>] __nfs_write_mapping+0x27/0x45
Aug 12 16:10:28 emelia kernel: [  361.851688]  [<c01ee3c9>] nfs_write_mapping+0x39/0x57
Aug 12 16:10:28 emelia kernel: [  361.851701]  [<c01ee40c>] nfs_wb_all+0x10/0x12
Aug 12 16:10:28 emelia kernel: [  361.851713]  [<c01e3169>] nfs_file_flush+0x8b/0xc8
Aug 12 16:10:28 emelia kernel: [  361.851723]  [<c0178505>] filp_close+0x31/0x5a
Aug 12 16:10:28 emelia kernel: [  361.851735]  [<c0126583>] put_files_struct+0x68/0xaa
Aug 12 16:10:28 emelia kernel: [  361.851747]  [<c01265fc>] exit_files+0x37/0x3c
Aug 12 16:10:28 emelia kernel: [  361.851759]  [<c0127988>] do_exit+0x21b/0x632
Aug 12 16:10:28 emelia kernel: [  361.851771]  [<c0127dfd>] do_group_exit+0x5e/0x85
Aug 12 16:10:28 emelia kernel: [  361.851786]  [<c0127e37>] sys_exit_group+0x13/0x15
Aug 12 16:10:28 emelia kernel: [  361.851799]  [<c0102e1d>] sysenter_past_esp+0x6a/0xb1
Aug 12 16:10:28 emelia kernel: [  361.851816]  =======================
Aug 12 16:10:28 emelia kernel: [  361.851821] INFO: lockdep is turned off.
Aug 12 16:10:30 emelia kernel: [  363.999492] INFO: task famd:2760 blocked for more than 120 seconds.
Aug 12 16:10:30 emelia kernel: [  363.999501] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 12 16:10:30 emelia kernel: [  363.999507] famd          D c093dd28  5684  2760      1
Aug 12 16:10:30 emelia kernel: [  363.999521]        f570dbe8 00000046 00000000 c093dd28 c0640030 c0642e40 c0642e40 c0642e40 
Aug 12 16:10:30 emelia kernel: [  363.999697]        f6dcc018 f6dcc270 c2cc2e40 00000001 c2cc2e40 f570dbc4 00000046 c2c027b0 
Aug 12 16:10:30 emelia kernel: [  363.999767]        f6516da0 c2c027b0 f6dcc270 fffefe66 00000001 00000046 c2c027a0 c2c027a0 
Aug 12 16:10:30 emelia kernel: [  363.999783] Call Trace:
Aug 12 16:10:30 emelia kernel: [  363.999796]  [<c0400fbd>] rpc_wait_bit_killable+0x2a/0x2e
Aug 12 16:10:30 emelia kernel: [  363.999808]  [<c041ee67>] __wait_on_bit+0x36/0x5d
Aug 12 16:10:30 emelia kernel: [  363.999817]  [<c0400f93>] ? rpc_wait_bit_killable+0x0/0x2e
Aug 12 16:10:30 emelia kernel: [  363.999828]  [<c041ef3a>] out_of_line_wait_on_bit+0xac/0xb4
Aug 12 16:10:30 emelia kernel: [  363.999836]  [<c0400f93>] ? rpc_wait_bit_killable+0x0/0x2e
Aug 12 16:10:30 emelia kernel: [  363.999846]  [<c0135286>] ? wake_bit_function+0x0/0x43
Aug 12 16:10:30 emelia kernel: [  363.999857]  [<c040157a>] __rpc_execute+0xcb/0x1e1
Aug 12 16:10:30 emelia kernel: [  363.999865]  [<c04016ab>] rpc_execute+0x1b/0x1e
Aug 12 16:10:30 emelia kernel: [  363.999872]  [<c03fc3d3>] rpc_run_task+0x43/0x49
Aug 12 16:10:30 emelia kernel: [  363.999881]  [<c03fc4de>] rpc_call_sync+0x44/0x5f
Aug 12 16:10:30 emelia kernel: [  363.999891]  [<c01eeeb8>] nfs3_rpc_wrapper+0x17/0x4d
Aug 12 16:10:30 emelia kernel: [  363.999900]  [<c01ef309>] nfs3_proc_access+0xc9/0x121
Aug 12 16:10:30 emelia kernel: [  363.999919]  [<c01e261a>] nfs_do_access+0x129/0x27c
Aug 12 16:10:30 emelia kernel: [  363.999931]  [<c01e2848>] nfs_permission+0xdb/0x13f
Aug 12 16:10:30 emelia kernel: [  363.999938]  [<c01e276d>] ? nfs_permission+0x0/0x13f
Aug 12 16:10:30 emelia kernel: [  363.999947]  [<c0180451>] permission+0x91/0xd0
Aug 12 16:10:30 emelia kernel: [  363.999956]  [<c01804b2>] vfs_permission+0x10/0x12
Aug 12 16:10:30 emelia kernel: [  363.999963]  [<c0181d56>] __link_path_walk+0x106/0xc1d
Aug 12 16:10:30 emelia kernel: [  363.999972]  [<c011764e>] ? kernel_map_pages+0xff/0x116
Aug 12 16:10:30 emelia kernel: [  363.999986]  [<c01828b9>] path_walk+0x4c/0x9b
Aug 12 16:10:30 emelia kernel: [  363.999994]  [<c0182c27>] do_path_lookup+0x198/0x1e1
Aug 12 16:10:30 emelia kernel: [  364.000003]  [<c018355d>] __user_walk_fd+0x2f/0x43
Aug 12 16:10:30 emelia kernel: [  364.000012]  [<c017ce43>] vfs_lstat_fd+0x16/0x3d
Aug 12 16:10:30 emelia kernel: [  364.000022]  [<c0130542>] ? groups_free+0x34/0x38
Aug 12 16:10:30 emelia kernel: [  364.000034]  [<c017cea8>] vfs_lstat+0x11/0x13
Aug 12 16:10:30 emelia kernel: [  364.000040]  [<c017cebe>] sys_lstat64+0x14/0x28
Aug 12 16:10:30 emelia kernel: [  364.000047]  [<c0130542>] ? groups_free+0x34/0x38
Aug 12 16:10:30 emelia kernel: [  364.000056]  [<c0130685>] ? set_current_groups+0x13f/0x149
Aug 12 16:10:30 emelia kernel: [  364.000067]  [<c01311d6>] ? sys_setresuid+0x153/0x165
Aug 12 16:10:30 emelia kernel: [  364.000077]  [<c0102e58>] ? sysenter_past_esp+0xa5/0xb1
Aug 12 16:10:30 emelia kernel: [  364.000086]  [<c0102e1d>] sysenter_past_esp+0x6a/0xb1
Aug 12 16:10:30 emelia kernel: [  364.000097]  =======================
Aug 12 16:10:30 emelia kernel: [  364.000100] INFO: lockdep is turned off.
Aug 12 16:10:33 emelia kernel: [  366.146658] INFO: task bash:5640 blocked for more than 120 seconds.
Aug 12 16:10:33 emelia kernel: [  366.146667] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 12 16:10:33 emelia kernel: [  366.146672] bash          D 00000000  6472  5640   5627
Aug 12 16:10:33 emelia kernel: [  366.146682]        e2d4bad4 00000046 00000000 00000000 c0640030 c0642e40 c0642e40 c0642e40 
Aug 12 16:10:33 emelia kernel: [  366.146699]        e2cee018 e2cee270 c2c70e40 00000000 c2c70e40 28d8cec5 00000037 c2c02b70 
Aug 12 16:10:33 emelia kernel: [  366.146713]        e2e43da0 c2c02b70 e2cee270 00000046 00000001 00000046 c2c02b60 c2c02b60 
Aug 12 16:10:33 emelia kernel: [  366.146728] Call Trace:
Aug 12 16:10:33 emelia kernel: [  366.146739]  [<c04208e1>] ? _spin_unlock_irqrestore+0x42/0x58
Aug 12 16:10:33 emelia kernel: [  366.146992]  [<c0400fbd>] rpc_wait_bit_killable+0x2a/0x2e
Aug 12 16:10:33 emelia kernel: [  366.147001]  [<c041ee67>] __wait_on_bit+0x36/0x5d
Aug 12 16:10:33 emelia kernel: [  366.147010]  [<c0400f93>] ? rpc_wait_bit_killable+0x0/0x2e
Aug 12 16:10:33 emelia kernel: [  366.147020]  [<c041ef3a>] out_of_line_wait_on_bit+0xac/0xb4
Aug 12 16:10:33 emelia kernel: [  366.147028]  [<c0400f93>] ? rpc_wait_bit_killable+0x0/0x2e
Aug 12 16:10:33 emelia kernel: [  366.147038]  [<c0135286>] ? wake_bit_function+0x0/0x43
Aug 12 16:10:33 emelia kernel: [  366.147050]  [<c040157a>] __rpc_execute+0xcb/0x1e1
Aug 12 16:10:33 emelia kernel: [  366.147058]  [<c04016ab>] rpc_execute+0x1b/0x1e
Aug 12 16:10:33 emelia kernel: [  366.147065]  [<c03fc3d3>] rpc_run_task+0x43/0x49
Aug 12 16:10:33 emelia kernel: [  366.147074]  [<c03fc4de>] rpc_call_sync+0x44/0x5f
Aug 12 16:10:33 emelia kernel: [  366.147083]  [<c01eeeb8>] nfs3_rpc_wrapper+0x17/0x4d
Aug 12 16:10:33 emelia kernel: [  366.147093]  [<c01ef4d4>] nfs3_proc_getattr+0x58/0x7a
Aug 12 16:10:33 emelia kernel: [  366.147102]  [<c01e4b31>] __nfs_revalidate_inode+0x122/0x26d
Aug 12 16:10:33 emelia kernel: [  366.147116]  [<c01e482b>] ? nfs_attribute_timeout+0x0/0xae
Aug 12 16:10:33 emelia kernel: [  366.147126]  [<c01e488c>] ? nfs_attribute_timeout+0x61/0xae
Aug 12 16:10:33 emelia kernel: [  366.147137]  [<c01e048b>] nfs_lookup_revalidate+0x245/0x471
Aug 12 16:10:33 emelia kernel: [  366.147160]  [<c0189b15>] ? __d_lookup+0x12a/0x146
Aug 12 16:10:33 emelia kernel: [  366.147171]  [<c01808ce>] do_lookup+0x115/0x154
Aug 12 16:10:33 emelia kernel: [  366.147180]  [<c0182466>] __link_path_walk+0x816/0xc1d
Aug 12 16:10:33 emelia kernel: [  366.147190]  [<c011764e>] ? kernel_map_pages+0xff/0x116
Aug 12 16:10:33 emelia kernel: [  366.147203]  [<c01828b9>] path_walk+0x4c/0x9b
Aug 12 16:10:33 emelia kernel: [  366.147212]  [<c0182c27>] do_path_lookup+0x198/0x1e1
Aug 12 16:10:33 emelia kernel: [  366.147220]  [<c0182ed0>] __path_lookup_intent_open+0x45/0x76
Aug 12 16:10:33 emelia kernel: [  366.147230]  [<c0182f11>] path_lookup_open+0x10/0x12
Aug 12 16:10:33 emelia kernel: [  366.147238]  [<c01838c9>] do_filp_open+0x9e/0x6cc
Aug 12 16:10:33 emelia kernel: [  366.147252]  [<c0178695>] ? get_unused_fd_flags+0xc3/0xcd
Aug 12 16:10:33 emelia kernel: [  366.147263]  [<c01786df>] do_sys_open+0x40/0xb6
Aug 12 16:10:33 emelia kernel: [  366.147271]  [<c0178797>] sys_open+0x1e/0x26
Aug 12 16:10:33 emelia kernel: [  366.147279]  [<c0102e1d>] sysenter_past_esp+0x6a/0xb1
Aug 12 16:10:33 emelia kernel: [  366.147290]  =======================
Aug 12 16:10:33 emelia kernel: [  366.147294] INFO: lockdep is turned off.
Aug 12 16:10:41 emelia kernel: [  374.736634] INFO: task gconfd-2:5492 blocked for more than 120 seconds.
Aug 12 16:10:41 emelia kernel: [  374.736644] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 12 16:10:41 emelia kernel: [  374.736648] gconfd-2      D e31a5dec  5132  5492      1
Aug 12 16:10:41 emelia kernel: [  374.736658]        e31a5e48 00000046 f62d3b28 e31a5dec c0640030 c0642e40 c0642e40 c0642e40 
Aug 12 16:10:41 emelia kernel: [  374.736674]        e44e9018 e44e9270 c2c70e40 00000000 c2c70e40 e31a5e24 00000046 c2c05280 
Aug 12 16:10:41 emelia kernel: [  374.736689]        f40a9da0 c2c05280 e44e9270 ffff24ad 00000001 00000046 c2c05270 c2c05270 
Aug 12 16:10:41 emelia kernel: [  374.736704] Call Trace:
Aug 12 16:10:41 emelia kernel: [  374.736716]  [<c01e9d71>] nfs_wait_bit_killable+0x2a/0x2e
Aug 12 16:10:41 emelia kernel: [  374.736728]  [<c041ee67>] __wait_on_bit+0x36/0x5d
Aug 12 16:10:41 emelia kernel: [  374.736737]  [<c01e9d47>] ? nfs_wait_bit_killable+0x0/0x2e
Aug 12 16:10:41 emelia kernel: [  374.736747]  [<c041ef3a>] out_of_line_wait_on_bit+0xac/0xb4
Aug 12 16:10:41 emelia kernel: [  374.736755]  [<c01e9d47>] ? nfs_wait_bit_killable+0x0/0x2e
Aug 12 16:10:41 emelia kernel: [  374.736765]  [<c0135286>] ? wake_bit_function+0x0/0x43
Aug 12 16:10:41 emelia kernel: [  374.736777]  [<c01e9d40>] nfs_wait_on_request+0x1f/0x26
Aug 12 16:10:41 emelia kernel: [  374.736784]  [<c01ed9cd>] nfs_sync_mapping_wait+0xde/0x27b
Aug 12 16:10:41 emelia kernel: [  374.736791]  [<c01e9c3d>] ? nfs_pageio_complete+0x8/0xa
Aug 12 16:10:41 emelia kernel: [  374.736805]  [<c01ee372>] __nfs_write_mapping+0x27/0x45
Aug 12 16:10:41 emelia kernel: [  374.736813]  [<c01ee3c9>] nfs_write_mapping+0x39/0x57
Aug 12 16:10:41 emelia kernel: [  374.736910]  [<c01ee40c>] nfs_wb_all+0x10/0x12
Aug 12 16:10:41 emelia kernel: [  374.736917]  [<c01e3169>] nfs_file_flush+0x8b/0xc8
Aug 12 16:10:41 emelia kernel: [  374.736926]  [<c0178505>] filp_close+0x31/0x5a
Aug 12 16:10:41 emelia kernel: [  374.736934]  [<c0178598>] sys_close+0x6a/0xa4
Aug 12 16:10:41 emelia kernel: [  374.736942]  [<c0102e1d>] sysenter_past_esp+0x6a/0xb1
Aug 12 16:10:41 emelia kernel: [  374.736954]  =======================
Aug 12 16:10:41 emelia kernel: [  374.736958] INFO: lockdep is turned off.

Server and client kernel configs attached.
-- 
- Athanasius = Athanasius(at)miggy.org / http://www.miggy.org/
                  Finger athan(at)fysh.org for PGP key
	   "And it's me who is my enemy. Me who beats me up.
Me who makes the monsters. Me who strips my confidence." Paula Cole - ME

View attachment "server-config" of type "text/plain" (62093 bytes)

View attachment "client-config" of type "text/plain" (61445 bytes)

Download attachment "signature.asc" of type "application/pgp-signature" (190 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ