Message-ID: <CAJ3xEMgw=9sj3rdahPEiST_yDfDJPNSZZLRn43tnb3bK4_RPzg@mail.gmail.com>
Date: Tue, 25 Apr 2017 14:14:37 +0300
From: Or Gerlitz <gerlitz.or@...il.com>
To: Erez Shitrit <erezsh@....mellanox.co.il>
Cc: Honggang LI <honli@...hat.com>, Erez Shitrit <erezsh@...lanox.com>,
Doug Ledford <dledford@...hat.com>,
"Hefty, Sean" <sean.hefty@...el.com>,
Hal Rosenstock <hal.rosenstock@...il.com>,
Paolo Abeni <pabeni@...hat.com>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
Linux Kernel <linux-kernel@...r.kernel.org>,
Linux Netdev List <netdev@...r.kernel.org>
Subject: Re: [PATCH] IB/IPoIB: Check the headroom size
On Tue, Apr 25, 2017 at 2:11 PM, Erez Shitrit <erezsh@....mellanox.co.il> wrote:
> On Tue, Apr 25, 2017 at 1:32 PM, Or Gerlitz <gerlitz.or@...il.com> wrote:
>> On Tue, Apr 25, 2017 at 12:55 PM, Honggang LI <honli@...hat.com> wrote:
>>> From: Honggang Li <honli@...hat.com>
>>>
>>> Minimal hard_header_len set by bond_compute_features is ETH_HLEN, which
>>> is smaller than IPOIB_HARD_LEN. ipoib_hard_header should check the
>>> size of headroom to avoid skb_under_panic.
>>
>> Sounds terrible; IPoIB bonding has been supported since ~2007. Thanks for
>> reporting this.
>>
>>> [ 122.871493] ipoib_hard_header: skb->head= ffff8808179d9400, skb->data= ffff8808179d9420, skb_headroom= 0x20
>>> [ 123.055400] bond0: Releasing backup interface mthca_ib1
>>> [ 123.560529] bond_compute_features:1112 bond0 bond_dev->hard_header_len = 14
>>> [ 123.568822] CPU: 0 PID: 12336 Comm: ifdown-ib Not tainted 4.9.0-debug #1
>>
>> Did you generate this trace by calling dump_stack(), or does it come from
>> existing kernel code?
>>
>>> Fixes: fc791b633515 ('IB/ipoib: move back IB LL address into the hard header')
>>
>> This is more of a workaround to avoid a crash or failure than a fix for
>> the actual problem.
>>
>> Erez, can you comment?
>
> We saw this after commit fc791b633515; it happened while removing the bond
> interface after its slaves (IPoIB interfaces) had been removed.
> At that point the bond interface sets its hard_header_len to the value
> used by eth interfaces (14 instead of 24). If a process that is not
> aware of the slaves' removal, or that was in the middle of sending, tries
> to send an (IGMP) packet, the packet reaches ipoib without enough space
> in the skb for the header, and here comes the panic.
Thanks for the info. Has this bug been there since ipoib/bonding day one
(and hence is my bug...), or was it indeed introduced later? If later, can
you explain how fc791b633515 introduced it, or do you only know that from
bisection?
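To make the failure mode described above concrete, here is a minimal
userspace sketch, not kernel code. struct fake_skb, fake_headroom(),
fake_push_checked() and try_header() are hypothetical names made up for
illustration; only the 14 and 24 byte lengths come from this thread. It
assumes the packet was built with headroom sized for the bond's
hard_header_len and shows a push that refuses to move data in front of
head, which is the condition that ends in skb_under_panic in the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HLEN        14  /* Ethernet header length */
#define IPOIB_HARD_LEN  24  /* IPoIB hard header length quoted in this thread */

/* Hypothetical, simplified stand-in for struct sk_buff: head/data only. */
struct fake_skb {
    unsigned char *head;  /* start of the allocated buffer */
    unsigned char *data;  /* start of the current packet data */
};

/* Bytes available in front of data, in the spirit of skb_headroom(). */
static size_t fake_headroom(const struct fake_skb *skb)
{
    return (size_t)(skb->data - skb->head);
}

/*
 * Checked push: refuse instead of moving data in front of head.
 * Returns 0 on success, -1 when the headroom is too small.
 */
static int fake_push_checked(struct fake_skb *skb, size_t len)
{
    if (fake_headroom(skb) < len)
        return -1;
    skb->data -= len;
    return 0;
}

/* Reserve `reserved` bytes of headroom, then try to push `hdr_len` bytes. */
static int try_header(size_t reserved, size_t hdr_len)
{
    unsigned char buf[128];
    struct fake_skb skb;

    assert(reserved <= sizeof(buf));
    skb.head = buf;
    skb.data = buf + reserved;
    return fake_push_checked(&skb, hdr_len);
}
```

With headroom sized for an Ethernet header, try_header(ETH_HLEN,
IPOIB_HARD_LEN) is rejected, while try_header(IPOIB_HARD_LEN,
IPOIB_HARD_LEN) succeeds. Whether such a check belongs in ipoib or the
bonding driver should keep the larger hard_header_len is the open question
here.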
> I agree with you that this fix is a workaround, and that it adds a check
> in the data path for all packets while the panic comes from a control
> flow. It probably should be fixed in the bonding driver.
So what's your suggestion? fc791b633515 is six months old, which means the
bug is in stable kernels and probably also in inbox drivers.
Or.