Message-ID: <alpine.DEB.2.00.0907301423250.18531@chino.kir.corp.google.com>
Date:	Thu, 30 Jul 2009 14:30:42 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Stephan von Krawczynski <skraw@...net.com>
cc:	"Rafael J. Wysocki" <rjw@...k.pl>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Kernel Testers List <kernel-testers@...r.kernel.org>,
	Justin Piszcz <jpiszcz@...idpixels.com>,
	Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
	Jesse Brandeburg <jesse.brandeburg@...el.com>,
	"David S. Miller" <davem@...emloft.com>
Subject: Re: [Bug #13648] nfsd: page allocation failure

On Mon, 27 Jul 2009, Stephan von Krawczynski wrote:

> This is no regression between 2.6.29 and 2.6.30.
> In fact we could reproduce the problem with kernel versions:
> 
> 2.6.27.26 < X <= 2.6.30.3
> 
> (Meaning 2.6.27.26 is the last one _not_ showing the problem).
> 

And 2.6.28.10 is showing the exact same problem as initially reported, 
right?

I noticed your /var/log/messages shows you're using SLUB as opposed 
to SLAB (which Justin was using when he hit order-0 allocation failures).  
SLUB uses order-1 allocations for this cache growth, and they're failing 
because of memory fragmentation, not because you're truly out of memory.
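
One way to see the difference between fragmentation and a real out-of-memory 
condition is to look at /proc/buddyinfo, which lists the number of free 
blocks per order for each zone. The following is only a minimal sketch: the 
helper names and the SAMPLE line are made up for illustration and are not 
taken from the report. If order 0 has plenty of free blocks but order 1 and 
above are close to zero, an order-1 allocation can fail even though memory 
is free.

```python
def parse_buddyinfo(text):
    """Return {(node, zone): [free block counts, list index = order]}."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected shape: "Node 0, zone Normal 12 5 0 ..."
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        zones[(node, parts[3])] = [int(n) for n in parts[4:]]
    return zones


def free_pages_at_or_above(counts, order):
    """Pages held in free blocks large enough to satisfy the given order."""
    return sum(n << o for o, n in enumerate(counts) if o >= order)


# Hypothetical, fully fragmented zone: many free pages, none contiguous.
SAMPLE = "Node 0, zone   Normal   4000      0      0      0      0      0\n"

if __name__ == "__main__":
    try:
        with open("/proc/buddyinfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the made-up sample off Linux
    for (node, zone), counts in parse_buddyinfo(text).items():
        print(node, zone,
              "free pages:", free_pages_at_or_above(counts, 0),
              "usable for order-1:", free_pages_at_or_above(counts, 1))
```

Running it around the time of a failure would show whether the Normal zone 
still has order-1 (or larger) blocks left.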

The only thing that is immediately apparent that changed in this path over 
these kernel versions (there were significant changes to e1000e) is the 
CRC stripping.  If it's loaded as a module, perhaps you could try

	modprobe e1000e CrcStripping=0,0

(assuming you have two adapters).

I've cc'd some relevant e1000e driver people in the hope they'll be able 
to diagnose this problem.  Memory fragmentation as a result of page 
group changes wouldn't affect order-0 allocations such as this on slab, so 
it's doubtful the VM regressed if you can reproduce the problem with 
CONFIG_SLAB.
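
To confirm which allocator a given kernel was built with before retesting, 
something like the sketch below could be used. It assumes the build config 
is available as /boot/config-<release>, which depends on the distribution; 
the function name is hypothetical.

```python
import os


def slab_allocator(config_text):
    """Return "SLAB", "SLUB" or "SLOB" if enabled in the config, else None."""
    for line in config_text.splitlines():
        line = line.strip()
        for name in ("SLAB", "SLUB", "SLOB"):
            if line == f"CONFIG_{name}=y":
                return name
    return None


if __name__ == "__main__":
    # Assumed location of the kernel build config; not present everywhere.
    path = f"/boot/config-{os.uname().release}"
    try:
        with open(path) as f:
            print(path, "->", slab_allocator(f.read()))
    except OSError as exc:
        print("could not read", path, exc)
```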