Message-ID: <1370219787.24311.113.camel@edumazet-glaptop>
Date: Sun, 02 Jun 2013 17:36:27 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Rob Herring <robherring2@...il.com>
Cc: netdev@...r.kernel.org
Subject: Re: panics in tcp_ack
On Sun, 2013-06-02 at 19:16 -0500, Rob Herring wrote:
> Sorry, this time with proper line wrapping...
>
> I'm debugging a kernel panic in the networking stack that happens with a
> cluster (20-40 nodes) of Calxeda highbank (ARM Cortex A9) nodes and
> typically only after 10-24 hours. The nodes are transferring files
> between nodes over TCP with 20 clients and servers per node. The kernel
> is based on the Ubuntu 3.5 kernel, which is based on 3.5.7.11. So far testing
> has shown that a 3.8.11-based (Ubuntu raring) kernel is fixed. Attempts to
> bisect have not yielded results as it seems multiple problems mask the
> issue. Perhaps there is some new feature which has indirectly fixed the
> problem in 3.8.
>
> This commit appears to fix a similar panic and seems to reduce the
> frequency after picking it up in the latest 3.5 stable:
>
> commit 16fad69cfe4adbbfa813de516757b87bcae36d93
> Author: Eric Dumazet <edumazet@...gle.com>
> Date: Thu Mar 14 05:40:32 2013 +0000
>
>     tcp: fix skb_availroom()
>
>     Chrome OS team reported a crash on a Pixel ChromeBook in TCP stack :
>
>     https://code.google.com/p/chromium/issues/detail?id=182056
>
>     commit a21d45726acac (tcp: avoid order-1 allocations on wifi and tx
>     path) did a poor choice adding an 'avail_size' field to skb, while
>     what we really needed was a 'reserved_tailroom' one.
>
>     It would have avoided commit 22b4a4f22da (tcp: fix retransmit of
>     partially acked frames) and this commit.
>
>     Crash occurs because skb_split() is not aware of the 'avail_size'
>     management (and should not be aware)
>
>     Signed-off-by: Eric Dumazet <edumazet@...gle.com>
>     Reported-by: Mukesh Agrawal <quiche@...omium.org>
>     Signed-off-by: David S. Miller <davem@...emloft.net>
>
> I've searched through the 3.8 and 3.9 stable fixes looking for possibly
> relevant commits and applied the following commits, which are not in
> 3.5 stable. However, they have not helped:
>
> net: drop dst before queueing fragments
> tcp: call tcp_replace_ts_recent() from tcp_ack()
> tcp: Reallocate headroom if it would overflow csum_start
> tcp: incoming connections might use wrong route under synflood
>
Try also:
commit 093162553c33e94 (tcp: force a dst refcount when prequeue packet)
commit 0d4f0608619de59 (tcp: dont handle MTU reduction on LISTEN socket)
commit 6731d2095bd4aef (tcp: fix for zero packets_in_flight was too broad)
commit 2e5f421211ff76c (tcp: frto should not set snd_cwnd to 0)
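
One way to try the four commits above on a 3.5-based tree is sketched
below. This is only an illustration under assumptions, not a tested
backport procedure: the helper names (is_ancestor_cmd, cherry_pick_cmd,
backport) are hypothetical, the working tree is assumed to have the
mainline history fetched so the abbreviated IDs resolve, and the
ancestry check only detects a commit merged under the same ID (a stable
backport carries a different hash and would not be noticed). By default
the script only prints the commands it would run.

```python
import subprocess

# Abbreviated commit IDs taken from the list above.
CANDIDATES = [
    "093162553c33e94",  # tcp: force a dst refcount when prequeue packet
    "0d4f0608619de59",  # tcp: dont handle MTU reduction on LISTEN socket
    "6731d2095bd4aef",  # tcp: fix for zero packets_in_flight was too broad
    "2e5f421211ff76c",  # tcp: frto should not set snd_cwnd to 0
]


def is_ancestor_cmd(commit, branch="HEAD"):
    """git command that exits 0 when `commit` is already in `branch`."""
    return ["git", "merge-base", "--is-ancestor", commit, branch]


def cherry_pick_cmd(commit):
    """git command that applies `commit`, recording its origin (-x)."""
    return ["git", "cherry-pick", "-x", commit]


def backport(commits, repo=".", dry_run=True):
    """Return the cherry-pick commands for `commits`, in order.

    With dry_run=False, commits already reachable from HEAD are skipped
    and the remaining cherry-picks are executed in `repo`; a conflict
    raises CalledProcessError and has to be resolved by hand.
    """
    planned = []
    for commit in commits:
        if not dry_run:
            check = subprocess.run(is_ancestor_cmd(commit), cwd=repo)
            if check.returncode == 0:
                continue  # same commit ID already in history
        cmd = cherry_pick_cmd(commit)
        planned.append(cmd)
        if not dry_run:
            subprocess.run(cmd, cwd=repo, check=True)
    return planned


if __name__ == "__main__":
    for cmd in backport(CANDIDATES):
        print(" ".join(cmd))
```

Applying the commits one at a time and retesting after each would make
it easier to tell which of them, if any, changes the panic frequency.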