Message-ID: <5560.1332217826@death.nxdomain>
Date: Mon, 19 Mar 2012 21:30:26 -0700
From: Jay Vosburgh <fubar@...ibm.com>
To: Joseph Glanville <joseph.glanville@...onvm.com.au>
cc: Roland Dreier <roland@...estorage.com>, linux-rdma@...r.kernel.org,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: Kernel Panic with bonding + IPoIB on 3.2.9
Joseph Glanville <joseph.glanville@...onvm.com.au> wrote:
>On 20 March 2012 06:05, Roland Dreier <roland@...estorage.com> wrote:
>> On Sun, Mar 18, 2012 at 1:21 PM, Joseph Glanville
>> <joseph.glanville@...onvm.com.au> wrote:
>>> [ 422.047024] kernel BUG at net/core/dev.c:1896!
>>
>> So this line is
>>
>> BUG_ON(offset >= skb_headlen(skb));
>>
>> right? No particular idea how we hit this, though...
>
>Yep... I have looked through most of /drivers/net/bonding and I can't
>really see why it should be blowing up there.. it really should cause
>the BUG_ON under normal IPoIB if the MTU was the cause - yet I have
>not experienced this.
>The bonding code doesn't seem to do anything special with the MTU
>other than propagating changes to the slaves.

	For IPoIB, though, there is some extra initialization in
bond_setup_by_slave(), and hard_header_len will end up being set to
something different from the usual Ethernet value.

	Looking at ipoib_setup(), I see that hard_header_len appears to
be set to 4 (IPOIB_ENCAP_LEN). My recollection was that the IPoIB
hard_header_len was quite a bit larger than that; it looks like it
changed very recently from IPOIB_ENCAP_LEN + INFINIBAND_ALEN to what it
is now:

commit afd87adacb5de00768b2e54f0bd851278f2e6179
Author: Roland Dreier <roland@...estorage.com>
Date:   Tue Feb 7 14:51:21 2012 +0000

    IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses

    [ Upstream commit 936d7de3d736e0737542641269436f4b5968e9ef ]

    Commit a0417fa3a18a ("net: Make qdisc_skb_cb upper size bound
    explicit.") made it possible for a netdev driver to use skb->cb
    between its header_ops.create method and its .ndo_start_xmit
    method.  Use this in ipoib_hard_header() to stash away the LL address
    (GID + QPN), instead of the "ipoib_pseudoheader" hack.  This allows
    IPoIB to stop lying about its hard_header_len, which will let us fix
    the L2 check for GRO.
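
	To put rough numbers on that change, here is a small user-space
sketch (not kernel code). The three constants are the values I believe
the kernel headers define; the two helper function names are made up
for this illustration and do not exist in the IPoIB driver.

```c
#include <assert.h>
#include <stddef.h>

/* Values as I understand the kernel headers to define them,
 * reproduced here for a user-space illustration only. */
enum {
	ETH_HLEN        = 14, /* Ethernet header length in bytes */
	INFINIBAND_ALEN = 20, /* IPoIB link-layer address length in bytes */
	IPOIB_ENCAP_LEN = 4   /* IPoIB encapsulation header length in bytes */
};

/* Hypothetical helper: hard_header_len before the commit quoted above. */
size_t ipoib_hard_header_len_old(void)
{
	return IPOIB_ENCAP_LEN + INFINIBAND_ALEN;
}

/* Hypothetical helper: hard_header_len after the commit quoted above. */
size_t ipoib_hard_header_len_new(void)
{
	return IPOIB_ENCAP_LEN;
}
```

So the value the bond would inherit from an IPoIB slave drops from 24
to 4 bytes, against 14 for regular Ethernet.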

	I don't know if this change could be causing the problem (it
appears to be new in 3.2.9), but hard_header_len is one of the few
places in the bonding TX path where IPoIB ends up being different from
regular Ethernet.
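
	For reference, the check that fires (the BUG_ON quoted earlier
in the thread) can be modelled in user space roughly as below. struct
model_skb and both functions are hypothetical stand-ins for the real
sk_buff and its helpers, and the sketch says nothing about which offset
or skb layout actually occurred in the reported panic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, minimal stand-in for the two sk_buff fields involved. */
struct model_skb {
	unsigned int len;      /* total bytes in the packet */
	unsigned int data_len; /* bytes held in paged fragments, not the linear area */
};

/* Mirrors what skb_headlen() computes: bytes in the linear area. */
unsigned int model_skb_headlen(const struct model_skb *skb)
{
	return skb->len - skb->data_len;
}

/* True when BUG_ON(offset >= skb_headlen(skb)) would trigger. */
bool model_would_bug(const struct model_skb *skb, unsigned int offset)
{
	return offset >= model_skb_headlen(skb);
}
```

The condition trips whenever the offset points at or beyond the end of
the linear part of the skb, whatever the reason it got there.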
-J
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@...ibm.com