Message-ID: <4718C651.5040804@psc.edu>
Date: Fri, 19 Oct 2007 10:59:29 -0400
From: John Heffner <jheffner@....edu>
To: Stephen Hemminger <shemminger@...ux-foundation.org>
CC: "David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: Fw: [Bug 9189] New: Oops in kernel 2.6.21-rc4 through 2.6.23, page allocation failure
Stephen Hemminger wrote:
> Looks like a memory overcommit on small machines?
>
> Begin forwarded message:
>
> Date: Fri, 19 Oct 2007 01:35:33 -0700 (PDT)
> From: bugme-daemon@...zilla.kernel.org
> To: shemminger@...ux-foundation.org
> Subject: [Bug 9189] New: Oops in kernel 2.6.21-rc4 through 2.6.23, page allocation failure
[snip]
> Problem Description: After a recent upgrade to kernel 2.6.23 (from 2.6.20) I
> have started seeing kernel oopses in the networking code. The problem is 100%
> reproducible in my environment. I've seen two slightly different backtraces,
> but both seem to be caused by the same commit.
>
> I ran git bisect and tracked the problem down to this commit:
> 53cdcc04c1e85d4e423b2822b66149b6f2e52c2c [TCP]: Fix tcp_mem[] initialization
>
> Once I revert this commit in 2.6.23 the problem goes away (this is also true
> for the kernel version generated by git bisect, 2.6.21-rc4).
>
> Backtrace #1:
> page allocation failure. order:1, mode:0x20
> [<c0131581>] __alloc_pages+0x2e1/0x300
> [<c0144bee>] cache_alloc_refill+0x29e/0x4b0
> [<c0144e6e>] __kmalloc+0x6e/0x80
> [<c0227103>] __alloc_skb+0x53/0x110
> [<c024de5c>] tcp_collapse+0x1ac/0x370
> [<c024e11d>] tcp_prune_queue+0xfd/0x2c0
> [<c024eaad>] tcp_data_queue+0x7cd/0xbb0
> [<c0225c2d>] skb_checksum+0x4d/0x2a0
> [<c02504ee>] tcp_rcv_established+0x36e/0x6a0
> [<c02561e4>] tcp_v4_do_rcv+0xb4/0x2a0
> [<c0131379>] __alloc_pages+0xd9/0x300
> [<c0258269>] tcp_v4_rcv+0x6a9/0x6c0
> [<c023ddb1>] ip_local_deliver+0x91/0x110
> [<c023e130>] ip_rcv+0x230/0x3c0
> [<c0227103>] __alloc_skb+0x53/0x110
> [<c022b742>] netif_receive_skb+0x152/0x1e0
> [<c022ce6f>] process_backlog+0x6f/0xe0
> [<c022cf3c>] net_rx_action+0x5c/0xf0
> [<c0115af2>] __do_softirq+0x42/0x90
> [<c0115b67>] do_softirq+0x27/0x30
> [<c01044fd>] do_IRQ+0x3d/0x70
> [<c0115818>] sys_gettimeofday+0x28/0x80
> [<c0102967>] common_interrupt+0x23/0x28
> =======================
I'm not surprised that this commit would make a difference in this
situation, since it does change the fraction of memory TCP is allowed to
use. (If it really is too much in this situation, we should tweak the
function.) However, I don't think this is the root cause. Why does it
oops here when the allocation fails?
-John
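
One way to check how large a fraction of memory TCP is allowed to use on an affected machine is to compare the three tcp_mem thresholds (low, pressure, high, all in pages) with the total number of physical pages. The following is a minimal sketch, not part of the original report: the helper names are hypothetical, and it assumes the thresholds are read from /proc/sys/net/ipv4/tcp_mem and the page count from os.sysconf("SC_PHYS_PAGES").

```python
import os
from typing import Dict, NamedTuple


class TcpMem(NamedTuple):
    """The three tcp_mem thresholds, all expressed in pages."""
    low: int
    pressure: int
    high: int


def parse_tcp_mem(text: str) -> TcpMem:
    """Parse the whitespace-separated contents of the tcp_mem sysctl."""
    fields = text.split()
    if len(fields) != 3:
        raise ValueError("expected three fields, got %d" % len(fields))
    return TcpMem(*(int(field) for field in fields))


def tcp_mem_fractions(tcp_mem: TcpMem, total_pages: int) -> Dict[str, float]:
    """Return each threshold as a fraction of total physical pages."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return {name: pages / total_pages
            for name, pages in tcp_mem._asdict().items()}


def read_live_fractions() -> Dict[str, float]:
    """Read the running system's values (Linux only; paths are assumptions)."""
    with open("/proc/sys/net/ipv4/tcp_mem") as sysctl_file:
        tcp_mem = parse_tcp_mem(sysctl_file.read())
    return tcp_mem_fractions(tcp_mem, os.sysconf("SC_PHYS_PAGES"))
```

Calling read_live_fractions() on kernels with and without the commit, on the same hardware, would show how much the allowed fraction changed. That only quantifies the difference in limits; it does not explain why a failed allocation leads to an oops.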