[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAF2d9ji-E4Og94X+NbcGs=WKSemryy6-WoPKzSVCaM13Rqjiug@mail.gmail.com>
Date: Wed, 9 Jul 2014 10:15:27 -0700
From: Mahesh Bandewar <maheshb@...gle.com>
To: Veaceslav Falico <vfalico@...hat.com>
Cc: Eric Dumazet <eric.dumazet@...il.com>,
Nikolay Aleksandrov <nikolay@...hat.com>,
Jay Vosburgh <fubar@...ibm.com>,
Andy Gospodarek <andy@...yhouse.net>,
David Miller <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>,
Maciej Zenczykowski <maze@...gle.com>
Subject: Re: [PATCH] bonding: Do not try to send packets over dead link in TLB mode.
On Wed, Jul 9, 2014 at 6:27 AM, Veaceslav Falico <vfalico@...hat.com> wrote:
> On Wed, Jul 09, 2014 at 03:17:29PM +0200, Eric Dumazet wrote:
>>
>> On Wed, 2014-07-09 at 14:04 +0200, Veaceslav Falico wrote:
>>>
>>> On Wed, Jul 09, 2014 at 12:25:43PM +0200, Nikolay Aleksandrov wrote:
>>> >On 07/09/2014 12:24 PM, Veaceslav Falico wrote:
>>> >> On Tue, Jul 08, 2014 at 06:09:58PM -0700, Mahesh Bandewar wrote:
>>> ...snip...
>>> >>> + spin_lock(&bond_info->slave_arr_lock);
>>> >>
>>> >> I don't think you can re-enter bond_alb_handle_link_change(), as it's
>>> >> protected either by rtnl or write-lock curr_active_slave.
>>> >>
>>> >Actually a very good catch :-)
>>> >Maybe the allocation above should be done with GFP_ATOMIC.
>>>
>>> For the record - it's indeed always under rtnl, so ASSERT_RTNL() (from
>>> your
>>> other email) is a good idea.
>>
>>
>> Strange. I basically suggested the ASSERT_RTNL() to Mahesh few days ago
>> and he tried this. But the assert triggered with miimon, so Mahesh added
>> back the spinlock.
>
>
> That's indeed strange... From the code:
>
> 2103 if (!rtnl_trylock()) {
> 2104 delay = 1;
> 2105 should_notify_peers = false;
> 2106 goto re_arm;
> 2107 }
> 2108
> 2109 bond_miimon_commit(bond);
> 2110
> 2111 rtnl_unlock(); /* might sleep, hold no other locks */
>
> And we can get there only through bond_miimon_commit(), as part of the
> miimon.
>
> Maybe you've hit the kmalloc(GFP_KERNEL) warning?
Hmm, actually as Eric mentioned, I did try removing the spinlock to
just use RTNL but assert failed and it wasn't sleepable fn called in
atomic context message (I assume that is what you are suggesting from
kmalloc warning).
I managed to find the stack trace for that -
RTNL: assertion failed at drivers/net/bonding/bond_alb.c (1371)
CPU: 2 PID: 3571 Comm: kworker/u16:7 Not tainted 3.11.10-smp-DEV #91
Workqueue: bond0 bond_mii_monitor [bonding]
ffff8801175e4800 ffff880113b53d38 ffffffff97f97cab 000000000000004d
ffff8801175c6800 ffff880113b53d58 ffffffffc046a17c ffff8801175c6800
ffff8801175e4800 ffff880113b53d88 ffffffffc045ef27 0000000000000000
Call Trace:
[<ffffffff97f97cab>] dump_stack+0x46/0x58
[<ffffffffc046a17c>] bond_alb_handle_link_change+0x16c/0x180 [bonding]
[<ffffffffc045ef27>] bond_handle_link_change+0x57/0x80 [bonding]
[<ffffffffc0462d79>] bond_mii_monitor+0x679/0x6e0 [bonding]
[<ffffffff97a6a7c0>] process_one_work+0x140/0x3f0
[<ffffffff97a6aef1>] worker_thread+0x121/0x370
[<ffffffff97a6add0>] ? rescuer_thread+0x320/0x320
[<ffffffff97a720a0>] kthread+0xc0/0xd0
[<ffffffff97a71fe0>] ? flush_kthread_worker+0x80/0x80
[<ffffffff97fa291c>] ret_from_fork+0x7c/0xb0
[<ffffffff97a71fe0>] ? flush_kthread_worker+0x80/0x80
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists