linux-kernel - Re: kernel 3.2.27 on arm: WARNING: at mm/page_alloc.c:2109 __alloc_pages

Message-ID: <1349368171.16011.79.camel@edumazet-glaptop>
Date:	Thu, 04 Oct 2012 18:29:31 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	mbizon@...ebox.fr
Cc:	David Madore <david+ml@...ore.org>,
	Francois Romieu <romieu@...zoreil.com>, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, Hugh Dickins <hughd@...gle.com>
Subject: Re: kernel 3.2.27 on arm: WARNING: at mm/page_alloc.c:2109
 __alloc_pages_nodemask+0x1d4/0x68c()

On Thu, 2012-10-04 at 18:02 +0200, Maxime Bizon wrote:
> On Fri, 2012-08-31 at 19:21 -0700, Hugh Dickins wrote:
> 
> Hi,
> 
> > Francois is right that a GFP_ATOMIC allocation from pskb_expand_head()
> > is failing, which can easily happen, and cause your "failed to reallocate
> > TX buffer" errors; but it's well worth looking up what's actually on
> > lines 2108 and 2109 of mm/page_alloc.c in 3.2.27:
> > 
> > 	if (order >= MAX_ORDER) {
> > 		WARN_ON_ONCE(!(gfp_mask & __GFP_NOWARN));
> > 
> > That was probably not a sane allocation request, it has gone out of range:
> > maybe the skb header is even corrupted.  If you're lucky, it might be
> > something that netdev will recognize as already fixed.
> 
> I have the same problem on the exact same hardware and found the cause:
> 
> Author: Eric Dumazet <eric.dumazet@...il.com>
> Date:   Tue Apr 10 20:08:39 2012 +0000
> 
>     net: allow pskb_expand_head() to get maximum tailroom
>     
>     [ Upstream commit 87151b8689d890dfb495081f7be9b9e257f7a2df ]
>     
> 
> It turns out this change has a bad side effect on drivers that use
> skb_recycle(), in this case mv643xx_eth.c
> 
> Since skb_recycle() resets skb->data using (skb->head + NET_SKB_PAD), a
> recycled skb going multiple times through a path that needs to expand
> skb head will get bigger and bigger each time, and you eventually end up
> with an allocation failure.
> 
> One idea to fix this would be to pass the needed skb size to skb_resize()
> and set skb->data to MIN(NET_SKB_PAD, (skb->end - skb->head - skb_size) / 2)
> 
> skb recycling gives a small speed boost, but does not get a lot of test
> coverage since only 3 drivers use it
> 
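The growth described above can be sketched with a toy userspace model. This is only an illustration, not the kernel code: it assumes power-of-two allocation buckets, a 4096-byte page, MAX_ORDER of 11, and a made-up 320-byte stand-in for the shared info area. All toy_* names and TOY_* constants are hypothetical. Each expansion asks for a little more than the current head and gets the whole bucket back, so the head roughly doubles per cycle until the order goes out of range.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u /* assumed page size */
#define TOY_MAX_ORDER 11    /* assumed MAX_ORDER value */
#define TOY_SHINFO    320u  /* hypothetical size of the shared info area */

/* Round up to a power of two, mimicking a power-of-two allocator bucket. */
static size_t toy_bucket(size_t n)
{
	size_t b = 32;

	while (b < n)
		b <<= 1;
	return b;
}

/* Smallest order such that (TOY_PAGE_SIZE << order) >= size. */
static int toy_get_order(size_t size)
{
	size_t span = TOY_PAGE_SIZE;
	int order = 0;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/*
 * One expansion: request the current usable size plus 'extra' bytes
 * (plus the shared info area) and keep the whole bucket as the new head.
 * Returns the new usable size.
 */
static size_t toy_expand(size_t usable, size_t extra)
{
	return toy_bucket(usable + extra + TOY_SHINFO) - TOY_SHINFO;
}

/*
 * Count how many recycle-and-expand cycles succeed before the next
 * request would need an order >= TOY_MAX_ORDER.
 */
static int toy_cycles_until_warning(size_t usable, size_t extra)
{
	int cycles = 0;

	assert(extra > 0); /* with no extra bytes the head never grows */
	for (;;) {
		size_t request = usable + extra + TOY_SHINFO;

		if (toy_get_order(toy_bucket(request)) >= TOY_MAX_ORDER)
			return cycles;
		usable = toy_expand(usable, extra);
		cycles++;
	}
}
```

Under these assumed constants, a head with 1728 usable bytes that needs 16 more bytes on every pass grows to 3776, then 7872, and so on; toy_cycles_until_warning(1728, 16) returns 11, so a handful of trips through the expanding path is enough to reach an out-of-range order.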

Thanks, Maxime.

Sure, we can probably fix this issue, but it's really not worth the pain.

I would get rid of skb_recycle(); it's superseded by build_skb() to get
cache-hot skbs anyway, and moreover, the rx path now uses skb->head
allocated from a page fragment for optimal GRO/TCP coalescing behavior.

skb_recycle() assumes skb allocation is slow, but it's not, per se.

Cache line misses are expensive; that's the real issue.


