[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <13612.1306349942@death>
Date: Wed, 25 May 2011 11:59:02 -0700
From: Jay Vosburgh <fubar@...ibm.com>
To: Neil Horman <nhorman@...driver.com>
cc: netdev@...r.kernel.org, Andy Gospodarek <andy@...yhouse.net>,
nicolas.2p.debian@...il.com,
"David S. Miller" <davem@...emloft.net>
Subject: Re: [PATCH] bonding: prevent deadlock on slave store with alb mode (v3)
Neil Horman <nhorman@...driver.com> wrote:
>This soft lockup was recently reported:
>
>[root@...l-per715-01 ~]# echo +bond5 > /sys/class/net/bonding_masters
>[root@...l-per715-01 ~]# echo +eth1 > /sys/class/net/bond5/bonding/slaves
>bonding: bond5: doing slave updates when interface is down.
>bonding bond5: master_dev is not up in bond_enslave
>[root@...l-per715-01 ~]# echo -eth1 > /sys/class/net/bond5/bonding/slaves
>bonding: bond5: doing slave updates when interface is down.
>
>BUG: soft lockup - CPU#12 stuck for 60s! [bash:6444]
>CPU 12:
>Modules linked in: bonding autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc
>be2d
>Pid: 6444, comm: bash Not tainted 2.6.18-262.el5 #1
>RIP: 0010:[<ffffffff80064bf0>] [<ffffffff80064bf0>]
>.text.lock.spinlock+0x26/00
>RSP: 0018:ffff810113167da8 EFLAGS: 00000286
>RAX: ffff810113167fd8 RBX: ffff810123a47800 RCX: 0000000000ff1025
>RDX: 0000000000000000 RSI: ffff810123a47800 RDI: ffff81021b57f6f8
>RBP: ffff81021b57f500 R08: 0000000000000000 R09: 000000000000000c
>R10: 00000000ffffffff R11: ffff81011d41c000 R12: ffff81021b57f000
>R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000282
>FS: 00002b3b41ef3f50(0000) GS:ffff810123b27940(0000) knlGS:0000000000000000
>CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>CR2: 00002b3b456dd000 CR3: 000000031fc60000 CR4: 00000000000006e0
>
>Call Trace:
> [<ffffffff80064af9>] _spin_lock_bh+0x9/0x14
> [<ffffffff886937d7>] :bonding:tlb_clear_slave+0x22/0xa1
> [<ffffffff8869423c>] :bonding:bond_alb_deinit_slave+0xba/0xf0
> [<ffffffff8868dda6>] :bonding:bond_release+0x1b4/0x450
> [<ffffffff8006457b>] __down_write_nested+0x12/0x92
> [<ffffffff88696ae4>] :bonding:bonding_store_slaves+0x25c/0x2f7
> [<ffffffff801106f7>] sysfs_write_file+0xb9/0xe8
> [<ffffffff80016b87>] vfs_write+0xce/0x174
> [<ffffffff80017450>] sys_write+0x45/0x6e
> [<ffffffff8005d28d>] tracesys+0xd5/0xe0
>
>It occurs because we are able to change the slave configuarion of a bond while
>the bond interface is down. The bonding driver initializes some data structures
>only after its ndo_open routine is called. Among them is the initalization of
>the alb tx and rx hash locks. So if we add or remove a slave without first
>opening the bond master device, we run the risk of trying to lock/unlock a
>spinlock that has garbage for data in it, which results in our above softlock.
>
>Note that sometimes this works, because in many cases an unlocked spinlock has
>the raw_lock parameter initialized to zero (meaning that the kzalloc of the
>net_device private data is equivalent to calling spin_lock_init), but thats not
>true in all cases, and we aren't guaranteed that condition, so we need to pass
>the relevant spinlocks through the spin_lock_init function.
>
>Fix it by moving the spin_lock_init calls for the tx and rx hashtable locks to
>the ndo_init path, so they are ready for use by the bond_store_slaves path.
>
>Change notes:
>v2) Based on conversation with Jay and Nicolas it seems that the ability to
>enslave devices while the bond master is down should be safe to do. As such
>this is an outlier bug, and so instead we'll just initalize the errant spinlocks
>in the init path rather than the open path, solving the problem. We'll also
>remove the warnings about the bond being down during enslave operations, since
>it should be safe
>
>v3) Fix spelling error
>
>Signed-off-by: Neil Horman <nhorman@...driver.com>
Signed-off-by: Jay Vosburgh <fubar@...ibm.com>
>Reported-by: jtluka@...hat.com
>CC: Jay Vosburgh <fubar@...ibm.com>
>CC: Andy Gospodarek <andy@...yhouse.net>
>CC: nicolas.2p.debian@...il.com
>CC: "David S. Miller" <davem@...emloft.net>
>---
> drivers/net/bonding/bond_alb.c | 4 ----
> drivers/net/bonding/bond_main.c | 16 ++++++++++------
> drivers/net/bonding/bond_sysfs.c | 6 ------
> 3 files changed, 10 insertions(+), 16 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
>index 8f2d2e7..2df9276 100644
>--- a/drivers/net/bonding/bond_alb.c
>+++ b/drivers/net/bonding/bond_alb.c
>@@ -163,8 +163,6 @@ static int tlb_initialize(struct bonding *bond)
> struct tlb_client_info *new_hashtbl;
> int i;
>
>- spin_lock_init(&(bond_info->tx_hashtbl_lock));
>-
> new_hashtbl = kzalloc(size, GFP_KERNEL);
> if (!new_hashtbl) {
> pr_err("%s: Error: Failed to allocate TLB hash table\n",
>@@ -747,8 +745,6 @@ static int rlb_initialize(struct bonding *bond)
> int size = RLB_HASH_TABLE_SIZE * sizeof(struct rlb_client_info);
> int i;
>
>- spin_lock_init(&(bond_info->rx_hashtbl_lock));
>-
> new_hashtbl = kmalloc(size, GFP_KERNEL);
> if (!new_hashtbl) {
> pr_err("%s: Error: Failed to allocate RLB hash table\n",
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 088fd84..c0045d7 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -1542,12 +1542,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
> bond_dev->name, slave_dev->name);
> }
>
>- /* bond must be initialized by bond_open() before enslaving */
>- if (!(bond_dev->flags & IFF_UP)) {
>- pr_warning("%s: master_dev is not up in bond_enslave\n",
>- bond_dev->name);
>- }
>-
> /* already enslaved */
> if (slave_dev->flags & IFF_SLAVE) {
> pr_debug("Error, Device was already enslaved\n");
>@@ -4832,9 +4826,19 @@ static int bond_init(struct net_device *bond_dev)
> {
> struct bonding *bond = netdev_priv(bond_dev);
> struct bond_net *bn = net_generic(dev_net(bond_dev), bond_net_id);
>+ struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
>
> pr_debug("Begin bond_init for %s\n", bond_dev->name);
>
>+ /*
>+ * Initialize locks that may be required during
>+ * en/deslave operations. All of the bond_open work
>+ * (of which this is part) should really be moved to
>+ * a phase prior to dev_open
>+ */
>+ spin_lock_init(&(bond_info->tx_hashtbl_lock));
>+ spin_lock_init(&(bond_info->rx_hashtbl_lock));
>+
> bond->wq = create_singlethread_workqueue(bond_dev->name);
> if (!bond->wq)
> return -ENOMEM;
>diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
>index 4059bfc..bb1319f 100644
>--- a/drivers/net/bonding/bond_sysfs.c
>+++ b/drivers/net/bonding/bond_sysfs.c
>@@ -227,12 +227,6 @@ static ssize_t bonding_store_slaves(struct device *d,
> struct net_device *dev;
> struct bonding *bond = to_bond(d);
>
>- /* Quick sanity check -- is the bond interface up? */
>- if (!(bond->dev->flags & IFF_UP)) {
>- pr_warning("%s: doing slave updates when interface is down.\n",
>- bond->dev->name);
>- }
>-
> if (!rtnl_trylock())
> return restart_syscall();
>
>--
>1.7.5.2
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists