Date: Thu, 06 Feb 2014 14:07:42 -0800
From: Jay Vosburgh <fubar@...ibm.com>
To: Cong Wang <cwang@...pensource.com>
Cc: Thomas Glanzmann <thomas@...nzmann.de>,
    Eric Dumazet <eric.dumazet@...il.com>,
    netdev <netdev@...r.kernel.org>,
    Veaceslav Falico <vfalico@...hat.com>, andy@...yhouse.net,
    Jiří Pírko <jiri@...nulli.us>
Subject: Re: RTNL: assertion failed at net/core/dev.c (4494) and RTNL: assertion failed at net/core/rtnetlink.c (940)

Jay Vosburgh <fubar@...ibm.com> wrote:

>Cong Wang <cwang@...pensource.com> wrote:
>
>>On Thu, Feb 6, 2014 at 12:51 PM, Thomas Glanzmann <thomas@...nzmann.de> wrote:
>>> Hello,
>>> this morning I checked out Linus tip and compiled it after booting my
>>> dmesg is full of:
>>>
>>> [ 8.944991] RTNL: assertion failed at net/core/dev.c (4494)
>>> [ 8.950640] CPU: 3 PID: 388 Comm: kworker/u24:4 Not tainted 3.14.0-rc1+ #3
>>> [ 8.950642] Hardware name: Supermicro X9SRD-F/X9SRD-F, BIOS 1.0a 10/15/2012
>>> [ 8.950654] Workqueue: bond0 bond_3ad_state_machine_handler [bonding]
>>> [ 8.950658] 0000000000000000 ffff881020c88000 ffffffff8138e219 ffff881020c88000
>>> [ 8.950664] ffffffff812d3091 ffff881023961040 ffffffff812e3132 0000000000000246
>>> [ 8.950670] 0000000000000020 ffff881020ab1be8 0000000020ab1ba8 0000000000000000
>>> [ 8.950675] Call Trace:
>>> [ 8.950686] [<ffffffff8138e219>] ? dump_stack+0x41/0x51
>>> [ 8.950694] [<ffffffff812d3091>] ? netdev_master_upper_dev_get+0x2a/0x4d
>>> [ 8.950699] [<ffffffff812e3132>] ? rtnl_fill_ifinfo+0x2c/0xac4
>>> [ 8.950707] [<ffffffff81072211>] ? print_time.part.5+0x50/0x54
>>> [ 8.950715] [<ffffffff812caf94>] ? __kmalloc_reserve.isra.42+0x2a/0x6d
>>> [ 8.950721] [<ffffffff81102040>] ? ksize+0x12/0x1e
>>> [ 8.950726] [<ffffffff812cb2b7>] ? __alloc_skb+0xb5/0x1a9
>>> [ 8.950731] [<ffffffff812e4626>] ? rtmsg_ifinfo+0x6c/0xd6
>>> [ 8.950739] [<ffffffffa035f4f9>] ? __enable_port.isra.17+0x51/0x5a [bonding]
>>> [ 8.950747] [<ffffffffa0360463>] ? ad_agg_selection_logic+0x3d3/0x3ed [bonding]
>>> [ 8.950754] [<ffffffffa0360d40>] ? bond_3ad_state_machine_handler+0x555/0x918 [bonding]
>>> [ 8.950761] [<ffffffff8104db2d>] ? process_one_work+0x191/0x293
>>> [ 8.950766] [<ffffffff8104dfde>] ? worker_thread+0x121/0x1e7
>>> [ 8.950770] [<ffffffff8104debd>] ? rescuer_thread+0x269/0x269
>>> [ 8.950777] [<ffffffff810527b6>] ? kthread+0x99/0xa1
>>> [ 8.950782] [<ffffffff8105271d>] ? __kthread_parkme+0x59/0x59
>>> [ 8.950789] [<ffffffff8139733c>] ? ret_from_fork+0x7c/0xb0
>>> [ 8.950794] [<ffffffff8105271d>] ? __kthread_parkme+0x59/0x59
>>
>>
>>Hmm, rtmsg_ifinfo() should be called with rtnl lock, but
>>__enable_port() is called
>>with rcu_read_lock() which means we can't block inside it, therefore we probably
>>should take rtnl lock outside:
>>
>>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>>index cce1f1b..3c09ffa 100644
>>--- a/drivers/net/bonding/bond_3ad.c
>>+++ b/drivers/net/bonding/bond_3ad.c
>>@@ -2065,6 +2065,7 @@ void bond_3ad_state_machine_handler(struct
>>work_struct *work)
>>        struct slave *slave;
>>        struct port *port;
>>
>>+       rtnl_lock();
>>        read_lock(&bond->lock);
>>        rcu_read_lock();
>>
>>@@ -2123,6 +2124,7 @@ void bond_3ad_state_machine_handler(struct
>>work_struct *work)
>> re_arm:
>>        rcu_read_unlock();
>>        read_unlock(&bond->lock);
>>+       rtnl_unlock();
>>        queue_delayed_work(bond->wq, &bond->ad_work, ad_delta_in_ticks);
>> }
>
>       That would eliminate the warning, but is suboptimal.  Acquiring
>RTNL is not necessary on the vast majority of state machine runs
>(because no state changes take place, i.e., no ports are disabled or
>enabled).  The above change would add 10 round trips per second to RTNL,
>which seems excessive.
>
>       Also, we cannot unconditionally acquire RTNL in this function,
>as it would race with the call to cancel_delayed_work_sync from
>bond_close (via bond_work_cancel_all).

        Thought of one more problem: we can't hold a regular lock while
calling rtmsg_ifinfo, as it may sleep in alloc_skb.  The rtmsg_ifinfo
call has to be RTNL and nothing else.

        -J

---
        -Jay Vosburgh, IBM Linux Technology Center, fubar@...ibm.com