netdev - Re: [PATCH] bonding: prevent deadlock on slave store with alb mode (v3)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <13612.1306349942@death>
Date:	Wed, 25 May 2011 11:59:02 -0700
From:	Jay Vosburgh <fubar@...ibm.com>
To:	Neil Horman <nhorman@...driver.com>
cc:	netdev@...r.kernel.org, Andy Gospodarek <andy@...yhouse.net>,
	nicolas.2p.debian@...il.com,
	"David S. Miller" <davem@...emloft.net>
Subject: Re: [PATCH] bonding: prevent deadlock on slave store with alb mode (v3)

Neil Horman <nhorman@...driver.com> wrote:

>This soft lockup was recently reported:
>
>[root@...l-per715-01 ~]# echo +bond5 > /sys/class/net/bonding_masters
>[root@...l-per715-01 ~]# echo +eth1 > /sys/class/net/bond5/bonding/slaves
>bonding: bond5: doing slave updates when interface is down.
>bonding bond5: master_dev is not up in bond_enslave
>[root@...l-per715-01 ~]# echo -eth1 > /sys/class/net/bond5/bonding/slaves
>bonding: bond5: doing slave updates when interface is down.
>
>BUG: soft lockup - CPU#12 stuck for 60s! [bash:6444]
>CPU 12:
>Modules linked in: bonding autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc
>be2d
>Pid: 6444, comm: bash Not tainted 2.6.18-262.el5 #1
>RIP: 0010:[<ffffffff80064bf0>]  [<ffffffff80064bf0>]
>.text.lock.spinlock+0x26/00
>RSP: 0018:ffff810113167da8  EFLAGS: 00000286
>RAX: ffff810113167fd8 RBX: ffff810123a47800 RCX: 0000000000ff1025
>RDX: 0000000000000000 RSI: ffff810123a47800 RDI: ffff81021b57f6f8
>RBP: ffff81021b57f500 R08: 0000000000000000 R09: 000000000000000c
>R10: 00000000ffffffff R11: ffff81011d41c000 R12: ffff81021b57f000
>R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000282
>FS:  00002b3b41ef3f50(0000) GS:ffff810123b27940(0000) knlGS:0000000000000000
>CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>CR2: 00002b3b456dd000 CR3: 000000031fc60000 CR4: 00000000000006e0
>
>Call Trace:
> [<ffffffff80064af9>] _spin_lock_bh+0x9/0x14
> [<ffffffff886937d7>] :bonding:tlb_clear_slave+0x22/0xa1
> [<ffffffff8869423c>] :bonding:bond_alb_deinit_slave+0xba/0xf0
> [<ffffffff8868dda6>] :bonding:bond_release+0x1b4/0x450
> [<ffffffff8006457b>] __down_write_nested+0x12/0x92
> [<ffffffff88696ae4>] :bonding:bonding_store_slaves+0x25c/0x2f7
> [<ffffffff801106f7>] sysfs_write_file+0xb9/0xe8
> [<ffffffff80016b87>] vfs_write+0xce/0x174
> [<ffffffff80017450>] sys_write+0x45/0x6e
> [<ffffffff8005d28d>] tracesys+0xd5/0xe0
>
>It occurs because we are able to change the slave configuarion of a bond while
>the bond interface is down.  The bonding driver initializes some data structures
>only after its ndo_open routine is called.  Among them is the initalization of
>the alb tx and rx hash locks.  So if we add or remove a slave without first
>opening the bond master device, we run the risk of trying to lock/unlock a
>spinlock that has garbage for data in it, which results in our above softlock.
>
>Note that sometimes this works, because in many cases an unlocked spinlock has
>the raw_lock parameter initialized to zero (meaning that the kzalloc of the
>net_device private data is equivalent to calling spin_lock_init), but thats not
>true in all cases, and we aren't guaranteed that condition, so we need to pass
>the relevant spinlocks through the spin_lock_init function.
>
>Fix it by moving the spin_lock_init calls for the tx and rx hashtable locks to
>the ndo_init path, so they are ready for use by the bond_store_slaves path.
>
>Change notes:
>v2) Based on conversation with Jay and Nicolas it seems that the ability to
>enslave devices while the bond master is down should be safe to do.  As such
>this is an outlier bug, and so instead we'll just initalize the errant spinlocks
>in the init path rather than the open path, solving the problem.  We'll also
>remove the warnings about the bond being down during enslave operations, since
>it should be safe
>
>v3) Fix spelling error
>
>Signed-off-by: Neil Horman <nhorman@...driver.com>

Signed-off-by: Jay Vosburgh <fubar@...ibm.com>


>Reported-by: jtluka@...hat.com
>CC: Jay Vosburgh <fubar@...ibm.com>
>CC: Andy Gospodarek <andy@...yhouse.net>
>CC: nicolas.2p.debian@...il.com
>CC: "David S. Miller" <davem@...emloft.net>
>---
> drivers/net/bonding/bond_alb.c   |    4 ----
> drivers/net/bonding/bond_main.c  |   16 ++++++++++------
> drivers/net/bonding/bond_sysfs.c |    6 ------
> 3 files changed, 10 insertions(+), 16 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
>index 8f2d2e7..2df9276 100644
>--- a/drivers/net/bonding/bond_alb.c
>+++ b/drivers/net/bonding/bond_alb.c
>@@ -163,8 +163,6 @@ static int tlb_initialize(struct bonding *bond)
> 	struct tlb_client_info *new_hashtbl;
> 	int i;
>
>-	spin_lock_init(&(bond_info->tx_hashtbl_lock));
>-
> 	new_hashtbl = kzalloc(size, GFP_KERNEL);
> 	if (!new_hashtbl) {
> 		pr_err("%s: Error: Failed to allocate TLB hash table\n",
>@@ -747,8 +745,6 @@ static int rlb_initialize(struct bonding *bond)
> 	int size = RLB_HASH_TABLE_SIZE * sizeof(struct rlb_client_info);
> 	int i;
>
>-	spin_lock_init(&(bond_info->rx_hashtbl_lock));
>-
> 	new_hashtbl = kmalloc(size, GFP_KERNEL);
> 	if (!new_hashtbl) {
> 		pr_err("%s: Error: Failed to allocate RLB hash table\n",
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 088fd84..c0045d7 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -1542,12 +1542,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
> 			   bond_dev->name, slave_dev->name);
> 	}
>
>-	/* bond must be initialized by bond_open() before enslaving */
>-	if (!(bond_dev->flags & IFF_UP)) {
>-		pr_warning("%s: master_dev is not up in bond_enslave\n",
>-			   bond_dev->name);
>-	}
>-
> 	/* already enslaved */
> 	if (slave_dev->flags & IFF_SLAVE) {
> 		pr_debug("Error, Device was already enslaved\n");
>@@ -4832,9 +4826,19 @@ static int bond_init(struct net_device *bond_dev)
> {
> 	struct bonding *bond = netdev_priv(bond_dev);
> 	struct bond_net *bn = net_generic(dev_net(bond_dev), bond_net_id);
>+	struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
>
> 	pr_debug("Begin bond_init for %s\n", bond_dev->name);
>
>+	/*
>+	 * Initialize locks that may be required during
>+	 * en/deslave operations.  All of the bond_open work
>+	 * (of which this is part) should really be moved to
>+	 * a phase prior to dev_open
>+	 */
>+	spin_lock_init(&(bond_info->tx_hashtbl_lock));
>+	spin_lock_init(&(bond_info->rx_hashtbl_lock));
>+
> 	bond->wq = create_singlethread_workqueue(bond_dev->name);
> 	if (!bond->wq)
> 		return -ENOMEM;
>diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
>index 4059bfc..bb1319f 100644
>--- a/drivers/net/bonding/bond_sysfs.c
>+++ b/drivers/net/bonding/bond_sysfs.c
>@@ -227,12 +227,6 @@ static ssize_t bonding_store_slaves(struct device *d,
> 	struct net_device *dev;
> 	struct bonding *bond = to_bond(d);
>
>-	/* Quick sanity check -- is the bond interface up? */
>-	if (!(bond->dev->flags & IFF_UP)) {
>-		pr_warning("%s: doing slave updates when interface is down.\n",
>-			   bond->dev->name);
>-	}
>-
> 	if (!rtnl_trylock())
> 		return restart_syscall();
>
>-- 
>1.7.5.2
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html