lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 28 Oct 2013 02:34:28 +0100
From:	Veaceslav Falico <vfalico@...hat.com>
To:	Ding Tianhong <dingtianhong@...wei.com>
Cc:	Jay Vosburgh <fubar@...ibm.com>,
	Andy Gospodarek <andy@...yhouse.net>,
	"David S. Miller" <davem@...emloft.net>,
	Nikolay Aleksandrov <nikolay@...hat.com>,
	Netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next v2 0/5] bonding: patchset for rcu use in bonding

On Mon, Oct 28, 2013 at 09:15:52AM +0800, Ding Tianhong wrote:
>On 2013/10/28 6:53, Veaceslav Falico wrote:
>> On Thu, Oct 24, 2013 at 11:08:35AM +0800, Ding Tianhong wrote:
>>> Hi:
>>>
>>> The slave list will add and del by bond_master_upper_dev_link() and bond_upper_dev_unlink(),
>>> which will call call_netdevice_notifiers(), even it is safe to call it in write bond lock now,
>>> but we can't sure that whether it is safe later, because other drivers may deal NETDEV_CHANGEUPPER
>>> in sleep way, so I didn't admit move the bond_upper_dev_unlink() in write bond lock.
>>>
>>> now the bond_for_each_slave only protect by rtnl_lock(), maybe use bond_for_each_slave_rcu is a good
>>> way to protect slave list for bond, but as a system slow path, it is no need to transform bond_for_each_slave()
>>> to bond_for_each_slave_rcu() in slow path, so in the patchset, I will remove the unused read bond lock
>>> for monitor function, maybe it is a better way, I will wait to accept any relay for it.
>>>
>>> Thanks for the Veaceslav Falico opinion.
>>>
>>> v2: add and modify commit for patchset and patch, it will be the first step for the whole patchset.
>>>
>>> Ding Tianhong (5):
>>>  bonding: remove bond read lock for bond_mii_monitor()
>>>  bonding: remove bond read lock for bond_alb_monitor()
>>>  bonding: remove bond read lock for bond_loadbalance_arp_mon()
>>>  bonding: remove bond read lock for bond_activebackup_arp_mon()
>>
>> This patch introduces a regression by boot-test with active backup mode:
>>
>> bond_activebackup_arp_mon() is already not holding the bond->lock, however
>> it might call bond_change_active_slave(), which does (in case of new_active):
>>
>>  912                         write_unlock_bh(&bond->curr_slave_lock);
>>  913                         read_unlock(&bond->lock);
>>  914  915                         call_netdevice_notifiers(NETDEV_BONDING_FAILOVER, bond->dev);
>>  916                         if (should_notify_peers)
>>  917                                 call_netdevice_notifiers(NETDEV_NOTIFY_PEERS,
>>  918                                                          bond->dev);
>>  919  920                         read_lock(&bond->lock);
>>  921                         write_lock_bh(&bond->curr_slave_lock);
>>
>> so it drops the bond->lock (which wasn't taken previously), and then takes
>> it (without anyone dropping it afterwards).
>>
>> I don't know how to fix it - cause a lot of other callers already take it,
>> and we can't just drop them (we'd race), and we can't remove it here (cause
>> we can't call notifiers while atomic).
>>
>> Which begs the question - was this patchset tested at all?
>>
>> [   21.796823] =====================================
>> [   21.796823] [ BUG: bad unlock balance detected! ]
... snip ...
>
>Hi David:
>yes, exactly I miss it and make a mistake, the bond_select_active_slave is still have the protect problem and
>need to be processed, I miss it, sorry, I will send a patch to fix the bug soon.
>
>Hi Veaceslav:
>sorry about the commit, I will pay more attention to the commit and test, thanks for your advise and report the bug,
>I have to admin that I was too careless.

I'll ask you once again, even though it seems that my NACK doesn't block
the patchset - try writing commit messages that actually describe why and
how you do it.

It's, actually, not only for the reviewers - it's also really good for you
- cause while writing the commit log you also understand a lot more what
are you doing, and might spot some corner cases (like this one).

Sorry for being negative, however it costs me *much* more time to review
patches without proper commit messages. I've done it once, twice, several
times more - but that's it, I refuse to spend my time on your skipped
homework.

It might also help to split patches into really small steps - as in - do
only one thing at a time, and describe it. This will help evade bugs *a
lot*. It also helps people who'll bisect it, bugfix it and review it -
because every patch will be a small, well-documented change, instead of a
chunk of code with a description 'lets remove bond->lock'.

Thank you and hope that helps.

>
>
>
>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>> the body of a message to majordomo@...r.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>> .
>>
>
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists