Message-ID: <51ABFE10.1030206@gmail.com>
Date:	Sun, 02 Jun 2013 21:23:12 -0500
From:	Rob Herring <robherring2@...il.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
CC:	netdev@...r.kernel.org
Subject: Re: panics in tcp_ack

On 06/02/2013 07:36 PM, Eric Dumazet wrote:
> On Sun, 2013-06-02 at 19:16 -0500, Rob Herring wrote:
>> Sorry, this time with proper line wrapping...
>>
>> I'm debugging a kernel panic in the networking stack that happens with a
>> cluster (20-40 nodes) of Calxeda highbank (ARM Cortex A9) nodes and
>> typically only after 10-24 hours. The nodes are transferring files
>> between nodes over TCP with 20 clients and servers per node. The kernel
>> is based on ubuntu 3.5 kernel which is based on 3.5.7.11. So far testing
>> has shown that 3.8.11 based (ubuntu raring) kernel is fixed. Attempts to
>> bisect have not yielded results as it seems multiple problems mask the
>> issue. Perhaps there is some new feature which has indirectly fixed the
>> problem in 3.8.
>>
>> This commit appears to fix a similar panic and seems to reduce the
>> frequency after picking it up in the latest 3.5 stable:
>>
>> commit 16fad69cfe4adbbfa813de516757b87bcae36d93
>> Author: Eric Dumazet <edumazet@...gle.com>
>> Date:   Thu Mar 14 05:40:32 2013 +0000
>>
>>     tcp: fix skb_availroom()
>>
>>     Chrome OS team reported a crash on a Pixel ChromeBook in TCP stack :
>>
>>     https://code.google.com/p/chromium/issues/detail?id=182056
>>
>>     commit a21d45726acac (tcp: avoid order-1 allocations on wifi and tx
>>     path) did a poor choice adding an 'avail_size' field to skb, while
>>     what we really needed was a 'reserved_tailroom' one.
>>
>>     It would have avoided commit 22b4a4f22da (tcp: fix retransmit of
>>     partially acked frames) and this commit.
>>
>>     Crash occurs because skb_split() is not aware of the 'avail_size'
>>     management (and should not be aware)
>>
>>     Signed-off-by: Eric Dumazet <edumazet@...gle.com>
>>     Reported-by: Mukesh Agrawal <quiche@...omium.org>
>>     Signed-off-by: David S. Miller <davem@...emloft.net>
>>
>> I've searched through the 3.8 and 3.9 stable fixes for possibly
>> relevant commits and applied the following ones, which are not in 3.5
>> stable. However, they have not helped:
>>
>> net: drop dst before queueing fragments
>> tcp: call tcp_replace_ts_recent() from tcp_ack()
>> tcp: Reallocate headroom if it would overflow csum_start
>> tcp: incoming connections might use wrong route under synflood
>>
> 
> try also :
> 
> commit 093162553c33e94 (tcp: force a dst refcount when prequeue packet)
> commit 0d4f0608619de59 (tcp: dont handle MTU reduction on LISTEN socket)

Will add and test.

> commit 6731d2095bd4aef (tcp: fix for zero packets_in_flight was too
> broad)
> commit 2e5f421211ff76c (tcp: frto should not set snd_cwnd to 0)

I have these 2.

Meanwhile, here's another panic. This one occurs because struct tcphdr *th
is NULL, which means skb->head is NULL. The skb itself is not NULL.

<4>[84967.163498] pc : [<c040798c>]    lr : [<c040eda8>]    psr: 600e0013
<4>[84967.163498] sp : ed335cc8  ip : 00000001  fp : 00000400
<4>[84967.174970] r10: ed346e34  r9 : 00000001  r8 : c06d71b8
<4>[84967.180188] r7 : 00000000  r6 : 00000000  r5 : ecd85840  r4 : ecd85840
<4>[84967.186709] r3 : 00000020  r2 : 0000003a  r1 : a4051080  r0 : ed346e00
<4>[84967.193234] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
<4>[84967.200365] Control: 10c5387d  Table: 2d08804a  DAC: 00000015
<0>[84967.206109] Process python (pid: 883, stack limit = 0xed3342f0)
<0>[84967.212021] Stack: (0xed335cc8 to 0xed336000)
<0>[84967.216373] 5cc0:                   000005a8 00000000 ed346e00 c040ac08 c06a5a00 ecd85840
<0>[84967.224549] 5ce0: ed346e00 ed346e00 00000000 c06d71b8 ed346e34 c040eda8 ed346ea0 00000000
<0>[84967.232720] 5d00: 00000000 00000000 e9805380 0000000a 0000001c ecd85840 00000000 ed346e00
<0>[84967.240897] 5d20: 00000000 c03b1d78 e9805380 ed346e00 0000fe88 3a61054b 00000400 00df2c34
<0>[84967.249075] 5d40: 00000040 c03fd2b8 0000a400 edf8c840 ed335eb0 ed335ed8 c23212f0 c23212e0
<0>[84967.257249] 5d60: 00df2c34 c17720e0 0000000e 00000400 00000400 000005a8 00000040 ed346ea0
<0>[84967.265419] 5d80: 00000000 00000000 ed334000 00000001 00010e30 00000630 00000000 00000000
<0>[84967.273591] 5da0: 0000000e 0000fe88 00000000 c06d6040 c2aeb380 ed346e00 ed335e30 eca26000
<0>[84967.281763] 5dc0: ed335ed8 00000400 00df2834 00000000 00000003 c041ea58 c795c2e8 ed4ecb50
<0>[84967.289935] 5de0: 00000000 ed335df0 eca26000 c03aef74 51ab6eeb 263fddc0 00000000 00000400
<0>[84967.298105] 5e00: eca26000 00000000 00000000 ed335ed8 01d0d6eb c00cb4d8 00000056 00000000
<0>[84967.306294] 5e20: 91827364 ed335e24 00001000 00000001 ed9b4050 00000000 00000000 00000001
<0>[84967.314472] 5e40: ffffffff 00000000 00000000 00000000 00000000 00000000 ecc3de80 00000001
<0>[84967.322642] 5e60: 00000000 00000000 00001000 00000000 ed335df0 00000000 00001000 c0012f28
<0>[84967.330812] 5e80: fee00100 0002c000 00000000 ed335f88 ed9b4000 fffffdee ed334000 00000001
<0>[84967.338983] 5ea0: b6ae35f8 c010aa38 0002c000 00000000 00000400 eca26000 c06a4508 00000000
<0>[84967.347152] 5ec0: 00000040 c03b07d4 fffffff7 00000000 00df2834 00000400 00000000 00000000
<0>[84967.355321] 5ee0: ed335ed0 00000001 00000000 00000000 00000040 00000000 00000000 c0223254
<0>[84967.363495] 5f00: 00001000 00000000 00001000 00000000 00000001 ed9b4008 600e0013 ffffffff
<0>[84967.371666] 5f20: c000dbc4 c06ff504 ffffffff 00000000 00014be7 03614c11 ed335f90 00000000
<0>[84967.379858] 5f40: 0000000a ed335f68 c000dd28 ed334000 00000000 00000003 0000000a 0000000a
<0>[84967.388032] 5f60: 00000000 0002c000 00014bf1 00002710 00000001 271ae81b b6aecd90 00000000
<0>[84967.396203] 5f80: 00d25050 00000121 c000dd28 ed334000 00000000 c03b0828 00000000 00000000
<0>[84967.404376] 5fa0: be8f2890 c000db60 b6aecd90 00000000 00000006 00df2834 00000400 00000000
<0>[84967.412547] 5fc0: b6aecd90 00000000 00d25050 00000121 00000400 00df2834 b6ad4fd0 00000003
<0>[84967.420719] 5fe0: 00000000 be8f289c 000a5505 b6f7398c 600e0010 00000006 00000000 00000000
<4>[84967.428912] [<c040798c>] (tcp_rcv_established+0x20/0x5e0) from [<c040eda8>] (tcp_v4_do_rcv+0xf0/0x2cc)
<4>[84967.438252] [<c040eda8>] (tcp_v4_do_rcv+0xf0/0x2cc) from [<c03b1d78>] (release_sock+0x84/0xfc)
<4>[84967.446900] [<c03b1d78>] (release_sock+0x84/0xfc) from [<c03fd2b8>] (tcp_sendmsg+0x378/0xcdc)
<4>[84967.455439] [<c03fd2b8>] (tcp_sendmsg+0x378/0xcdc) from [<c041ea58>] (inet_sendmsg+0x80/0xb8)
<4>[84967.463966] [<c041ea58>] (inet_sendmsg+0x80/0xb8) from [<c03aef74>] (sock_sendmsg+0xcc/0xec)
<4>[84967.472404] [<c03aef74>] (sock_sendmsg+0xcc/0xec) from [<c03b07d4>] (sys_sendto+0xc0/0xfc)
<4>[84967.480670] [<c03b07d4>] (sys_sendto+0xc0/0xfc) from [<c03b0828>] (sys_send+0x18/0x20)
<4>[84967.488599] [<c03b0828>] (sys_send+0x18/0x20) from [<c000db60>] (ret_fast_syscall+0x0/0x30)
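
For anyone reading the backtrace above, here is a rough sketch of how each
entry can be decoded: the bracketed value is the absolute address, and
"symbol+0xOFF/0xSIZE" gives the offset into the function and the function's
size, so address minus offset is where the function starts. The names
FRAME_RE, parse_frames, function_start and SAMPLE are made up for this
illustration and are not part of any kernel tooling; SAMPLE is copied from
the trace, including one entry where the mail wrapping split the address
from the symbol.

```python
import re

# Matches "[<c040798c>] (tcp_rcv_established+0x20/0x5e0)". \s+ between the
# address and the symbol tolerates a line break inserted by mail wrapping.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+\((\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)\)"
)

def parse_frames(text):
    """Return unique (symbol, address, offset, size) tuples, innermost first."""
    frames = []
    seen = set()
    for addr, sym, off, size in FRAME_RE.findall(text):
        key = (sym, addr)
        if key in seen:
            # Each frame is printed twice: once as callee, once as caller.
            continue
        seen.add(key)
        frames.append((sym, int(addr, 16), int(off, 16), int(size, 16)))
    return frames

def function_start(frame):
    """Absolute address of the first instruction of the frame's function."""
    _sym, addr, off, _size = frame
    return addr - off

SAMPLE = """\
<4>[84967.428912] [<c040798c>] (tcp_rcv_established+0x20/0x5e0) from
[<c040eda8>] (tcp_v4_do_rcv+0xf0/0x2cc)
<4>[84967.480670] [<c03b07d4>] (sys_sendto+0xc0/0xfc) from [<c03b0828>]
(sys_send+0x18/0x20)
"""

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        sym, addr, off, size = frame
        print(f"{sym}: addr={addr:#010x} start={function_start(frame):#010x} "
              f"offset={off:#x} of {size:#x}")
```

The first frame is the faulting pc, only 0x20 bytes into
tcp_rcv_established, which fits a fault very early in the function. The
start address can then be matched against System.map or a disassembly of
vmlinux, assuming a matching build is available.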


Rob

