Message-ID: <4C7E73CB.7030603@canonical.com>
Date: Wed, 01 Sep 2010 09:39:55 -0600
From: Tim Gardner <tim.gardner@...onical.com>
To: linux-nfs@...r.kernel.org
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
neilb@...e.de, bfields@...ldses.org, Trond.Myklebust@...app.com
Subject: nfsd deadlock, 2.6.36-rc3
I've been pursuing a simple reproducer for an NFS lockup that shows up
under stress. There is a bunch of info (some of it extraneous) in
http://bugs.launchpad.net/bugs/561210. I can reproduce it by writing to
a loopback-mounted NFS export:
/etc/fstab: 127.0.0.1:/srv /mnt/srv nfs rw 0 2
/etc/exports: /srv 127.0.0.1(rw,insecure,no_subtree_check)
See the attached scripts test_master.sh and test_client.sh. I simply
repeat './test_master.sh wait' until nfsd locks up, typically within 1-3
cycles, e.g.,
cd /mnt/srv
while true; do ./test_master.sh wait; done
Note that the same test runs indefinitely without locking up when invoked
directly from /srv, i.e., on the local filesystem rather than through the
NFS mount, e.g.,
cd /srv
while true; do ./test_master.sh wait; done
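For counting how many cycles it takes to hang, a bounded variant of the loop can be sketched as follows. This is only a sketch, assuming coreutils `timeout` is available; the function name `run_until_hang` and its arguments are made up for illustration and are not part of the attached scripts. If the test process is stuck in uninterruptible sleep, `timeout` may not be able to kill it, so the wrapper itself can still hang.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly, stopping when one cycle
# exceeds a time limit or when a maximum cycle count is reached.
# usage: run_until_hang LIMIT_SECONDS MAX_CYCLES COMMAND [ARGS...]
run_until_hang() {
    limit=$1; max=$2; shift 2
    cycle=0
    while [ "$cycle" -lt "$max" ]; do
        cycle=$((cycle + 1))
        timeout "$limit" "$@"
        status=$?
        # coreutils timeout exits with 124 when the limit is exceeded
        if [ "$status" -eq 124 ]; then
            echo "cycle $cycle: timed out after ${limit}s"
            return 1
        fi
    done
    echo "completed $cycle cycles without a timeout"
    return 0
}
```

Possible usage, with an arbitrary 600 second limit: `cd /mnt/srv && run_until_hang 600 100 ./test_master.sh wait`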
This issue, or something like it, appears to exist as far back as I've
tested (Ubuntu Lucid 2.6.32.21). For now I'm assuming that, since the
symptoms are similar, whatever lockup bug is found in -rc3 is also the
likely culprit on the older kernel.
See attached dmesg and config. Debug options of interest that I've
enabled are CONFIG_DEBUG_SLAB, CONFIG_DEBUG_SLAB_LEAK,
CONFIG_DEBUG_SPINLOCK, CONFIG_DEBUG_MUTEXES.
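To confirm that a given kernel config has the same debug options enabled, something like the following could be used. It is a minimal sketch: the function name `check_debug_opts` is hypothetical, and it assumes the options are built in (`=y`) and that the config file path is passed as the first argument.

```shell
#!/bin/sh
# Hypothetical helper: report which of the debug options listed above
# are enabled (=y) in the kernel config file given as $1.
check_debug_opts() {
    for opt in CONFIG_DEBUG_SLAB CONFIG_DEBUG_SLAB_LEAK \
               CONFIG_DEBUG_SPINLOCK CONFIG_DEBUG_MUTEXES; do
        # Anchor on both ends so CONFIG_DEBUG_SLAB does not match
        # CONFIG_DEBUG_SLAB_LEAK.
        if grep -q "^${opt}=y\$" "$1"; then
            echo "$opt enabled"
        else
            echo "$opt missing"
        fi
    done
}
```

Possible usage: `check_debug_opts config.txt`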
dmesg.txt contains the initial 'INFO: task nfsd:1263 blocked for more
than 120 seconds.' complaints as well as information dumped from
echo d | sudo tee /proc/sysrq-trigger
echo w | sudo tee /proc/sysrq-trigger
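Since dmesg.txt is large, the blocked tasks can be summarized by filtering for the hung-task message quoted above. This is a sketch only: the function name `hung_tasks` is hypothetical, and it assumes the message format shown in the quoted line and task names without spaces.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct "name:pid" pairs reported by the
# hung-task detector in the saved dmesg file given as $1.
hung_tasks() {
    # [^ ]* assumes the task name itself contains no spaces.
    grep -o 'INFO: task [^ ]* blocked for more than [0-9]* seconds' "$1" |
        awk '{ print $3 }' | sort -u
}
```

Possible usage: `hung_tasks dmesg.txt`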
Anything else I can provide?
rtg
--
Tim Gardner tim.gardner@...onical.com
Download attachment "test_client.sh" of type "application/x-sh" (116 bytes)
Download attachment "test_master.sh" of type "application/x-sh" (394 bytes)
View attachment "dmesg.txt" of type "text/plain" (221345 bytes)
View attachment "config.txt" of type "text/plain" (123181 bytes)