linux-kernel - Re: [Bug #13648] nfsd: page allocation failure

Date:	Fri, 31 Jul 2009 13:48:43 +0200
From:	Stephan von Krawczynski <skraw@...net.com>
To:	David Rientjes <rientjes@...gle.com>
Cc:	"Rafael J. Wysocki" <rjw@...k.pl>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Kernel Testers List <kernel-testers@...r.kernel.org>,
	Justin Piszcz <jpiszcz@...idpixels.com>,
	Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
	Jesse Brandeburg <jesse.brandeburg@...el.com>,
	"David S. Miller" <davem@...emloft.com>
Subject: Re: [Bug #13648] nfsd: page allocation failure

On Thu, 30 Jul 2009 14:30:42 -0700 (PDT)
David Rientjes <rientjes@...gle.com> wrote:

> On Mon, 27 Jul 2009, Stephan von Krawczynski wrote:
> 
> > This is no regression between 2.6.29 and 2.6.30.
> > In fact we could reproduce the problem with kernel versions:
> > 
> > 2.6.27.26 < X <= 2.6.30.3
> > 
> > (Meaning 2.6.27.26 is the last one _not_ showing the problem).
> > 
> 
> And 2.6.28.10 is showing the exact same problem as initially reported, 
> right?

Yes, that is correct.
 
> I noticed your /var/log/messages is showing you're using slub as opposed 
> to slab (which Justin was using, and causing order-0 allocations errors).  
> SLUB uses order-1 allocations for this cache growth and it's failing 
> because of memory fragmentation, not because you're truly oom.

Originally I used slab; I only switched to slub because someone asked me to
test it. The results looked pretty much the same to me.
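
For anyone who wants to check the fragmentation theory on an affected box,
here is a minimal sketch that summarizes /proc/buddyinfo. It assumes the usual
"Node N, zone NAME count0 count1 ..." line layout; the function names
(parse_buddyinfo, free_pages_at_or_above, report) are made up for this
illustration and are not part of any kernel tool. Plenty of free order-0
blocks but few or no pages in blocks of order 1 and above would point to
fragmentation rather than a true out-of-memory condition.

```python
def parse_buddyinfo(text):
    """Return {(node, zone): [free block count per order, order 0 first]}."""
    zones = {}
    for line in text.splitlines():
        fields = line.split()
        # Assumed line shape: Node 0, zone Normal 1046 527 128 ...
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        node = int(fields[1].rstrip(","))
        zones[(node, fields[3])] = [int(n) for n in fields[4:]]
    return zones


def free_pages_at_or_above(counts, order):
    """Number of free pages held in blocks of at least the given order."""
    # A free block of order o spans 2**o pages.
    return sum(n << o for o, n in enumerate(counts) if o >= order)


def report(path="/proc/buddyinfo"):
    """Print, per zone, order-0 blocks versus pages usable for order >= 1."""
    with open(path) as f:
        zones = parse_buddyinfo(f.read())
    for (node, zone), counts in sorted(zones.items()):
        print(
            f"node {node} zone {zone}: "
            f"order-0 blocks={counts[0]}, "
            f"pages in order>=1 blocks={free_pages_at_or_above(counts, 1)}"
        )
```

Calling report() shortly after an allocation failure shows up in the log
should give a rough picture; the counters change quickly, so a single snapshot
is only a hint.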
 
> The only thing that is immediately apparent that changed in this path over 
> these kernel versions (there were significant changes to e1000e) is the 
> CRC stripping.  If it's loaded as a module, perhaps you could try
> 
> 	modprobe e1000e CrcStripping=0,0
> 
> (assuming you have two adapters).

I will try that.
 
> I've cc'd some relevant e1000e driver people in the hopes they'll be able 
> to diagnose this problem.  Memory fragmentation as the result of page 
> group changes wouldn't affect order-0 allocations such as this on slab, so 
> it's doubtful the VM regressed if you can reproduce the problem with 
> CONFIG_SLAB.

I can. As I said before, the problem first showed up with slab.


-- 
Regards,
Stephan