[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7603.1199920750@death>
Date: Wed, 09 Jan 2008 15:19:10 -0800
From: Jay Vosburgh <fubar@...ibm.com>
To: Herbert Xu <herbert@...dor.apana.org.au>
cc: Andy Gospodarek <andy@...yhouse.net>,
Krzysztof Oledzki <olel@....pl>, netdev@...r.kernel.org,
Jeff Garzik <jgarzik@...ox.com>,
David Miller <davem@...emloft.net>
Subject: Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
Herbert Xu <herbert@...dor.apana.org.au> wrote:
>On Wed, Jan 09, 2008 at 03:17:09PM -0500, Andy Gospodarek wrote:
>>
>> Agreed. And despite Herbert's opinion that this isn't the correct fix,
>> I think this will work fine. This is one of the cases where we can take
>> a write_lock(bond->lock) in softirq context, so we need to drop that (or
>> make sure all the read_lock's are read_lock_bh's). The latter isn't
>> really an option since having a majority of the bonding code run in
>> softirq context was what we are trying to avoid with the workqueue
>> conversion.
>
>No that's not the point. The point is to move the majority of the code
>into process context so that you can take the RTNL. Once you have taken
>the RTNL you can disable BH all you want and I don't care one bit.
I'm not sure how we could move more code into a process context;
much of the bonding driver is at the mercy of its callers, as in this
case. The monitoring stuff and enslave / deslave is all in a process
context now (workqueue). The transmit processing functions, for
example, can't be assumed to be in any particular context as they're
called by dev_queue_xmit.
The function in question here is the dev->set_multicast_list
method function, which is sometimes called with RTNL, and sometimes with
other random locks, but seems to always be called with netif_tx_lock_bh.
Then, bonding's bond_set_multicast_list calls dev_set_promiscuity (on
slaves), which I believe is an "RTNL only" function (which is probably
another discussion).
>In any case, fixing a known dead-lock is important.
Agreed.
I'm now thinking for just the topic at hand (the previously
posted lockdep report), something like this will resolve it:
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 77d004d..b7ac10b 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3937,8 +3937,6 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
struct bonding *bond = bond_dev->priv;
struct dev_mc_list *dmi;
- write_lock_bh(&bond->lock);
-
/*
* Do promisc before checking multicast_mode
*/
@@ -3959,6 +3957,8 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
bond_set_allmulti(bond, -1);
}
+ read_lock_bh(&bond->lock);
+
bond->flags = bond_dev->flags;
/* looking for addresses to add to slaves' mc list */
@@ -3979,7 +3979,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
bond_mc_list_destroy(bond);
bond_mc_list_copy(bond_dev->mc_list, bond, GFP_ATOMIC);
- write_unlock_bh(&bond->lock);
+ read_unlock_bh(&bond->lock);
}
/*
This (a) converts the write_lock_bh to a read_lock_bh, and moves
it down to not cover the promisc/allmulti code (which is protected by
RTNL). I'm not sure that this is really a signficant alteration, but
it's a bit closer to "correct."
-J
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@...ibm.com
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists