Message-ID: <4A395119.5060108@msgid.tls.msk.ru>
Date:	Thu, 18 Jun 2009 00:24:57 +0400
From:	Michael Tokarev <mjt@....msk.ru>
To:	"J. Bruce Fields" <bfields@...ldses.org>
CC:	Justin Piszcz <jpiszcz@...idpixels.com>,
	linux-kernel@...r.kernel.org
Subject: Re: 2.6.29.1: nfsd: page allocation failure - nfsd or kernel problem?

J. Bruce Fields wrote:
> On Wed, Jun 17, 2009 at 02:39:06PM +0400, Michael Tokarev wrote:
>> Justin Piszcz wrote:
>>>
>>> On Wed, 17 Jun 2009, Michael Tokarev wrote:
>>>
>>>> Michael Tokarev wrote:
>>>>> Justin Piszcz wrote:
>>>> ...
>>>>
>>>> Justin, by the way, what's the underlying filesystem on the server?
>>>>
>>>> I've seen this error on 2 machines already (both running 2.6.29.x
>>>> x86-64), and in both cases the filesystem on the server was xfs.
>>>> May this be related somehow to
>>>> http://bugzilla.kernel.org/show_bug.cgi?id=13375 ?
>>>> That one is different, but also about xfs and nfs.  I'm trying to
>>>> reproduce the problem on different filesystem...
>>> Hello, I am also running XFS on 2.6.29.x x86-64.
>>>
>>> For me, the error happened when I was running an XFSDUMP from a
>>> client (and dumping) the stream over NFS to the XFS
>>> server/filesystem.  This is typically when the error occurs or
>>> during heavy I/O.
>> Very similar load was here -- not xfsdump but tar and dump of ext3
>> filesystems.
>>
>> And no, it's NOT xfs-related: I can trigger the same issue easily on

Note the NOT, in upper case ;)

>> ext4 as well.  About 20 minutes of running 'dump' of another fs
>> to the nfs mount and voila, nfs server reports the same page allocation
>> failure.  Note that all file operations are still working, i.e. it
>> produces good (not corrupted) files on the server.
> 
> There's a possibly related report for 2.6.30 here:
> 
> 	http://bugzilla.kernel.org/show_bug.cgi?id=13518

Does not look similar.

I reproduced the issue here.  The slab that keeps growing is buffer_head.
It grows slowly -- right now, after ~5 minutes of constant writes over
nfs, it holds 428423 objects and is growing at about 5000 objects/minute.
When the writes stop, the cache slowly shrinks back to an acceptable
size, probably as the data actually gets written to disk.
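
For anyone who wants to watch the same counter: a minimal sketch, assuming
/proc/slabinfo is readable (usually root only) and that the second column
of each data line is <active_objs>.  The helper names active_objects,
objects_per_minute and watch are made up for illustration, and this is not
necessarily how the numbers above were collected.

```python
import time


def active_objects(slabinfo_text, cache="buffer_head"):
    """Return <active_objs> for `cache`, or None if the cache is not listed."""
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Data lines look like: name <active_objs> <num_objs> <objsize> ...
        # Header lines start with "slabinfo" or "#", so they never match.
        if len(fields) >= 2 and fields[0] == cache:
            return int(fields[1])
    return None


def objects_per_minute(first, second, seconds):
    """Growth rate between two samples taken `seconds` apart."""
    return (second - first) * 60.0 / seconds


def watch(cache="buffer_head", interval=60, path="/proc/slabinfo"):
    """Print the object count and growth rate every `interval` seconds."""
    previous = None
    while True:
        with open(path) as f:
            current = active_objects(f.read(), cache)
        if current is None:
            raise SystemExit("cache %r not found in %s" % (cache, path))
        if previous is None:
            print("%s: %d objects" % (cache, current))
        else:
            rate = objects_per_minute(previous, current, interval)
            print("%s: %d objects, %+.0f objects/minute" % (cache, current, rate))
        previous = current
        time.sleep(interval)
```

Calling watch() on the nfs server while the client is writing should show
whether buffer_head keeps growing, and whether it shrinks again once the
writes stop.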

It looks like we need a bug entry for this :)

I'll retry with 2.6.30, hopefully tomorrow.

/mjt
