linux-kernel - Re: nfsd deadlock, 2.6.36-rc3

Date:	Thu, 02 Sep 2010 09:13:42 -0600
From:	Tim Gardner <tim.gardner@...onical.com>
To:	"J. Bruce Fields" <bfields@...ldses.org>
CC:	Neil Brown <neilb@...e.de>, linux-nfs@...r.kernel.org,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Trond.Myklebust@...app.com
Subject: Re: nfsd deadlock, 2.6.36-rc3

On 09/01/2010 03:13 PM, J. Bruce Fields wrote:
> On Wed, Sep 01, 2010 at 03:11:23PM -0600, Tim Gardner wrote:
>> On 09/01/2010 02:55 PM, Neil Brown wrote:
>>> On Wed, 1 Sep 2010 12:54:01 -0400
>>> "J. Bruce Fields"<bfields@...ldses.org>   wrote:
>>>
>>>> On Wed, Sep 01, 2010 at 09:39:55AM -0600, Tim Gardner wrote:
>>>>> I've been pursuing a simple reproducer for an NFS lockup that shows
>>>>> up under stress. There is a bunch of info (some of it extraneous) in
>>>>> http://bugs.launchpad.net/bugs/561210. I can reproduce it by writing
>>>>> loop mounted NFS exports:
>>>>>
>>>>> /etc/fstab: 127.0.0.1:/srv /mnt/srv nfs rw 0 2
>>>>> /etc/exports: /srv 127.0.0.1(rw,insecure,no_subtree_check)
>>>>>
>>>>> See the attached scripts test_master.sh and test_client.sh. I simply
>>>>> repeat './test_master.sh wait' until nfsd locks up, typically within
>>>>> 1-3 cycles, e.g.,
>>>>
>>>> Without looking at the dmesg and scripts carefully to confirm, one
>>>> possible explanation is a deadlock when the server can't allocate memory
>>>> required to service client requests, memory which the client itself
>>>> needs to free by writing back dirty pages, but can't because the server
>>>> isn't processing its writes.
>>>
>>> Having looked closely I'd say it is almost certainly this issue.
>>> nfsd thread 1266 is in zone_reclaim waiting on a page to be written out so
>>> the memory can be reused.
>>> The other nfsd threads are blocking on a mutex held by 1266.
>>> The dd processes are waiting for pages to be written to the server
>>>
>>> The particular page that 1266 is waiting on is almost certainly a page on an
>>> NFS file, so you have a cyclic deadlock.
>>>
>>>>
>>>> For that reason we just don't support loopback mounts--they're OK for
>>>> light testing, but it would be difficult to make them completely robust
>>>> under load.
>>>
>>> I wonder if we could use 'containers' to partition available memory between
>>> 'nfsd threads' and 'everything else'??  Probably not worth the effort.
>>>
>>> NeilBrown
>>>
>>
>> I'm currently working with my support folks to reproduce this using
>> the exact same configuration as the customer, e.g., an NFS server
>> (running as a guest on a VMWare ESX host) serving multiple gigabit
>> clients.
>>
>> I assume that is a reasonable scenario?
>
> Assuming no VMWare problem (which I know nothing about), sure.
>
> --b.
>

The support folks were able to reproduce the failure using external
clients after about 6 hours. We think it's the same symptom as the one
reported in https://bugzilla.kernel.org/show_bug.cgi?id=16056. The
backported patch b608b283a962caaa280756bc8563016a71712acf from Trond was
just incorporated into the Ubuntu 10.04 kernel, so they'll retest to see
whether it's a bona fide fix.
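
For anyone following the loopback case quoted above, the cycle Neil
describes can be modelled as a wait-for graph. This is only a rough
sketch, not kernel code: the node names (nfsd_1266, nfsd_other,
nfs_page_writeback, dd) are illustrative labels taken from his
description, and the edge from writeback to the other nfsd threads
encodes my assumption that the write needs a free nfsd thread to
complete.

```python
def find_cycle(waits_for):
    """Return one cycle in a wait-for graph as a list of nodes, or None."""
    visiting = []   # nodes on the current depth-first path, in order
    done = set()    # nodes fully explored without finding a cycle

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Reached a node already on the path: the tail of the path is a cycle.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in waits_for:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Hypothetical labels; each entry reads "key is blocked waiting on value(s)".
waits_for = {
    "dd": ["nfs_page_writeback"],          # dd waits for its pages to reach the server
    "nfsd_1266": ["nfs_page_writeback"],   # in zone_reclaim, waiting on a dirty page
    "nfs_page_writeback": ["nfsd_other"],  # assumed: the write needs a free nfsd thread
    "nfsd_other": ["nfsd_1266"],           # blocked on the mutex held by 1266
}

cycle = find_cycle(waits_for)
print(" -> ".join(cycle) if cycle else "no cycle")
```

Note that dd is not part of the cycle itself; it is merely stuck behind
it, which matches the dd processes showing up blocked in the traces.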

rtg
-- 
Tim Gardner tim.gardner@...onical.com