Message-ID: <4718C651.5040804@psc.edu>
Date:	Fri, 19 Oct 2007 10:59:29 -0400
From:	John Heffner <jheffner@....edu>
To:	Stephen Hemminger <shemminger@...ux-foundation.org>
CC:	"David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: Fw: [Bug 9189] New: Oops in kernel 2.6.21-rc4 through 2.6.23,
 page allocation failure

Stephen Hemminger wrote:
> Looks like memory overcommit on small machines?
> 
> Begin forwarded message:
> 
> Date: Fri, 19 Oct 2007 01:35:33 -0700 (PDT)
> From: bugme-daemon@...zilla.kernel.org
> To: shemminger@...ux-foundation.org
> Subject: [Bug 9189] New: Oops in kernel 2.6.21-rc4 through 2.6.23, page allocation failure
[snip]
> Problem Description: After a recent upgrade to kernel 2.6.23 (from 2.6.20), I
> have started seeing kernel oopses in the networking code. The problem is 100%
> reproducible in my environment. I've seen two slightly different backtraces, but
> both seem to be caused by the same commit.
> 
> I've performed a git bisect and tracked the problem down to this commit:
> 53cdcc04c1e85d4e423b2822b66149b6f2e52c2c [TCP]: Fix tcp_mem[] initialization
> 
> Once I revert this commit in 2.6.23, the problem goes away (this is also true
> for the kernel version generated by git bisect, 2.6.21-rc4).
> 
> Backtrace #1:
> page allocation failure. order:1, mode:0x20
>  [<c0131581>] __alloc_pages+0x2e1/0x300   
>  [<c0144bee>] cache_alloc_refill+0x29e/0x4b0
>  [<c0144e6e>] __kmalloc+0x6e/0x80
>  [<c0227103>] __alloc_skb+0x53/0x110
>  [<c024de5c>] tcp_collapse+0x1ac/0x370
>  [<c024e11d>] tcp_prune_queue+0xfd/0x2c0
>  [<c024eaad>] tcp_data_queue+0x7cd/0xbb0
>  [<c0225c2d>] skb_checksum+0x4d/0x2a0
>  [<c02504ee>] tcp_rcv_established+0x36e/0x6a0
>  [<c02561e4>] tcp_v4_do_rcv+0xb4/0x2a0
>  [<c0131379>] __alloc_pages+0xd9/0x300
>  [<c0258269>] tcp_v4_rcv+0x6a9/0x6c0
>  [<c023ddb1>] ip_local_deliver+0x91/0x110
>  [<c023e130>] ip_rcv+0x230/0x3c0
>  [<c0227103>] __alloc_skb+0x53/0x110
>  [<c022b742>] netif_receive_skb+0x152/0x1e0
>  [<c022ce6f>] process_backlog+0x6f/0xe0
>  [<c022cf3c>] net_rx_action+0x5c/0xf0
>  [<c0115af2>] __do_softirq+0x42/0x90
>  [<c0115b67>] do_softirq+0x27/0x30
>  [<c01044fd>] do_IRQ+0x3d/0x70
>  [<c0115818>] sys_gettimeofday+0x28/0x80
>  [<c0102967>] common_interrupt+0x23/0x28
>  =======================


I'm not surprised that this commit would make a difference in this 
situation, since it does change the fraction of memory TCP is allowed to 
use.  (If it really is too much in this situation, we should tweak the 
function.)  However, I don't think this is the root cause.  Why does it 
oops here when the allocation fails?

   -John
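
For anyone who wants to check what fraction of memory TCP is allowed to
use on a given machine, here is a minimal sketch in Python. It assumes a
Linux system where /proc/sys/net/ipv4/tcp_mem holds the three tcp_mem[]
thresholds (low, pressure, high) in pages. The helper names and the
numbers in the usage note are hypothetical illustrations, not values from
the bug report, and this does not reproduce the kernel's initialization
function.

```python
import os


def parse_tcp_mem(text):
    """Parse the three tcp_mem thresholds (low, pressure, high), in pages."""
    low, pressure, high = (int(field) for field in text.split())
    return low, pressure, high


def tcp_mem_fractions(tcp_mem_text, total_pages):
    """Return each tcp_mem threshold as a fraction of total_pages."""
    return tuple(pages / total_pages for pages in parse_tcp_mem(tcp_mem_text))


def read_current_fractions():
    """Read the live values (Linux only; not called automatically)."""
    with open("/proc/sys/net/ipv4/tcp_mem") as f:
        text = f.read()
    # SC_PHYS_PAGES is the number of physical memory pages on this machine.
    total_pages = os.sysconf("SC_PHYS_PAGES")
    return tcp_mem_fractions(text, total_pages)
```

As a made-up example, tcp_mem_fractions("1536 2048 3072", 8192) gives the
thresholds as fractions of an 8192-page machine. Comparing the fractions
with and without the commit on the affected box would show how much the
limits actually moved.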