[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <5265DFF8.8090007@huawei.com>
Date: Tue, 22 Oct 2013 10:16:24 +0800
From: Ding Tianhong <dingtianhong@...wei.com>
To: Veaceslav Falico <vfalico@...hat.com>
CC: Jay Vosburgh <fubar@...ibm.com>,
Andy Gospodarek <andy@...yhouse.net>,
"David S. Miller" <davem@...emloft.net>,
Nikolay Aleksandrov <nikolay@...hat.com>,
Netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next 0/5] bonding: patchset for rcu use in bonding
On 2013/10/21 21:21, Veaceslav Falico wrote:
> On Mon, Oct 21, 2013 at 02:41:44PM +0200, Veaceslav Falico wrote:
>> On Mon, Oct 21, 2013 at 08:32:11PM +0800, Ding Tianhong wrote:
>>> On 2013/10/21 17:35, Veaceslav Falico wrote:
>>>> On Mon, Oct 21, 2013 at 05:27:51PM +0800, Ding Tianhong wrote:
>>>>> On 2013/10/21 17:13, Veaceslav Falico wrote:
>>>>>> On Mon, Oct 21, 2013 at 04:58:36PM +0800, Ding Tianhong wrote:
>>>>>>> Hi:
>>>>>>>
>>>>>>> The Patch Set will remove the invalid lock for bond work queue and replace it
>>>>>>> with rtnl lock, as read lock for bond could not protect slave list any more.
>>>>>>
>>>>>> rtnl lock is a lot more expensive than bond lock, and not only for bond,
>>>>>> but for all the networking stack.
>>>>>>
>>>>>> Why is the bond->lock invalid? It correctly protects slaves from being
>>>>>> modified concurrently.
>>>>>>
>>>>>> I don't see the point in this patchset.
>>>>>>
>>>>>
>>>>> yes, rtnl lock is a big lock, but I think bond->lock could not protect
>>>>> bond_for_each_slave any more, am I miss something?
>>>>
>>>> Why can't it protect bond_for_each_slave()?
>>>>
>>>
>>> bond_master_upper_dev_link() and bond_upper_dev_unlink() was only in rtnl lock,
>>> bond_for_each_slave may changed while loop in bond read lock, but it sees that
>>> nothing serious will happen yet.
>>> Maybe I miss something.
>>
>> Even if it is unsafe to use bond_for_each_slave() while holding bond->lock
>> - it means that we must protect the list by locking the
>> bond_upper_dev_(un)link() via bond->lock, but not by removing bond->lock
>> from everywhere where it is now. And I'm not that sure if it's safe or not.
>
> I've quickly looked over the code - yes, theoretically we could race
> between bond_for_each_slave() that is not rtnl-protected and
> bond_upper_dev_(un)link().
>
> Your patchset, also, doesn't 'replace' bond->lock with rtnl_lock(), cause
> everywhere the rtnl_lock() is already present, it just moves it around,
> while removing the bond->lock.
>
> The commit message is wrong and says actually nothing why is it done, how
> is it done, why it's safe to do so and what do we get in the end. For every
> patch, and the cover letter is also not an exception.
>
It is my fault, too lazy to describe the detail for the patch and occurs so many
missmatch, I'll fix it later.
> I'd suggest you either:
>
> 1) add bond->lock around bond_upper_dev_(un)link() (GFP_ATOMIC might be needed).
>
bond_upper_dev_(un)link() will call call_netdevice_notifiers(), it is not safe to call it
in read lock, call_netdevice_notifiers() may sleep or schedule, and sometimes
call_netdevice_notifiers() will read bond lock again.
> or
>
> 2) add ASSERT_RTNL() to bond_for_each_slave() macro, catch all the offenders
> and remove the bond->lock from them. Also, I'm not that sure that it's safe
> to do so - cause one of the slaves (not the slave list) might change, and
> we might have race conditions there.
>
yes, it is not a good idea, as your meaning, it is a big cost, but monitor is a slow path,
I think the cost is acceptable,whatever we should find a better way.
> or
>
> 3) move bond_for_each_slave() to bond_for_each_slave_rcu() where appropriate.
>
> And in any case write specific commit messages - bonding's code is really
> old and full of locks that were placed for some reason (and the reason
> might have gone away long ago, too), so it's really hard to say if the
> change is safe or not.
above all, the 3) is the wise idea, rtnl or rcu, every one is enough for
bond_for_each_slave().
>
> I'd personally go for either 3) (preferred), or 1).
>
> Sorry, I'm a bit tired of going in-depth on your patches. Start either
> doing patches with commit messages that *prove* me that you're right (I'll,
> obviously, verify it - but at least I'll know what you're doing, and won't
> have to figure it out from code), or I'll start explicitly NAKing them.
>
> Sorry again, but I don't really have time for that. I didn't have time to
> review your last patchset (RCUifying the remaining transmit path), and now
> I can understand nothing from their commit description, except that you've
> changed bond_for_each_slave() to bond_for_each_slave_rcu() and bond->lock
> to rcu_read_lock(). I'm not saying that they're wrong, just that they're
> really hard to understand.
>
> So, please, start writing commit messages.
>
>>
>>>
>>> Ding
>>>
>>>
>>>>>
>>>>> Ding
>>>>>
>>>>>>>
>>>>>>> Ding Tianhong (5):
>>>>>>> bonding: remove bond read lock for bond_mii_monitor()
>>>>>>> bonding: remove bond read lock for bond_alb_monitor()
>>>>>>> bonding: remove bond read lock for bond_loadbalance_arp_mon()
>>>>>>> bonding: remove bond read lock for bond_activebackup_arp_mon()
>>>>>>> bonding: remove bond read lock for bond_3ad_state_machine_handler()
>>>>>>>
>>>>>>> drivers/net/bonding/bond_3ad.c | 9 ++--
>>>>>>> drivers/net/bonding/bond_alb.c | 20 ++------
>>>>>>> drivers/net/bonding/bond_main.c | 100 +++++++++++++---------------------------
>>>>>>> 3 files changed, 40 insertions(+), 89 deletions(-)
>>>>>>>
>>>>>>> --
>>>>>>> 1.8.2.1
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>> .
>>>>
>>>
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@...r.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> .
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists