lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <10582.1453788049@famine>
Date:	Mon, 25 Jan 2016 22:00:49 -0800
From:	Jay Vosburgh <jay.vosburgh@...onical.com>
To:	zhuyj <zyjzyj2000@...il.com>
cc:	mkubecek@...e.cz, vfalico@...il.com, gospo@...ulusnetworks.com,
	netdev@...r.kernel.org, boris.shteinbock@...driver.com,
	emil.s.tantilov@...el.com
Subject: Re: [PATCH 1/1] bonding: Use notifiers for slave link state detection

zhuyj <zyjzyj2000@...il.com> wrote:

>On 01/26/2016 08:43 AM, Jay Vosburgh wrote:
>> <zyjzyj2000@...il.com> wrote:
>>
>>> From: Zhu Yanjun <zyjzyj2000@...il.com>
>>>
>>> Bonding will utilize notifier callbacks to detect slave
>>> link state changes. It is intended to be used with miimon
>>> set to zero, and does not support the updelay or downdelay
>>> options to bonding.
>>>
>>> Because of link flap from the slave interface, if the notifier
>>> is NETDEV_UP while the actual link state is down, it is not
>>> necessary to continue.
>>>
>>> Signed-off-by: Jay Vosburgh <jay.vosburgh@...onical.com>
>> 	I haven't signed off on this patch.
>>
>> 	I've just started some testing, but as before immediately get an
>> RCU warning; it looks to be coming from bond_miimon_inspect_slave();
>>
>> [  316.473050] bond1: Enslaving eth1 as a backup interface with an up link
>> [  316.473059]
>> [  316.473806] ===============================
>> [  316.475630] [ INFO: suspicious RCU usage. ]
>> [  316.477519] 4.4.0+ #38 Not tainted
>> [  316.479094] -------------------------------
>> [  316.480765] drivers/net/bonding/bond_main.c:2024 suspicious rcu_dereference_check() usage!
>>
>> 	This is presumably because the "case NETDEV_DOWN" call to
>> bond_miimon_inspect_slave does not hold RCU.  It does hold RTNL, though,
>> which should be safe for this usage (RTNL mutexes changes to the active
>> slave).  The appended patch on top of the original makes the warning go
>> away.
>>
>> 	I'm still testing the patch and have no comment about its
>> functionality as yet.
>>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 9f67948..e3faee9 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -2014,14 +2014,14 @@ static int bond_slave_info_query(struct net_device *bond_dev, struct ifslave *in
>>     /*-------------------------------- Monitoring
>> -------------------------------*/
>>   -/* called with rcu_read_lock() */
>> +/* called with rcu_read_lock() or RTNL */
>>   static int bond_miimon_inspect_slave(struct bonding *bond, struct slave *slave,
>>   				     unsigned long event)
>>   {
>>   	int link_state;
>>   	bool ignore_updelay;
>>   -	ignore_updelay = !rcu_dereference(bond->curr_active_slave);
>> +	ignore_updelay = !rcu_dereference_rtnl(bond->curr_active_slave);
>
>Thanks a lot.
>Because kernel v4.4 needs this kind of patch, I backport this patch from
>net-next to kernel v4.4.
>
>If it is not appropriate, I will revert this patch.

	I don't understand what you mean here.

	I've tested the patch (with my above modification), and while I
seem to be hitting an unrelated bug in the ARP monitor, I believe this
patch will misbehave when the ARP monitor is running.

	For example, if arp_interval=1000 and miimon=0, the link state
notifier callback will change a slave to up should a notifier event take
place.  So, hypothetically, if a slave is "down" according to the ARP
monitor (but actually carrier up), and then experience a carrier down
then up transition, the slave would be set to "up" even though the ARP
monitor believes it to be down.

	I'm not able to induce the speedy link flap events, so I'm not
sure about this portion of the patch:

+	/* Because of link flap from the slave interface, it is possilbe that
+	 * the notifiler is NETDEV_UP while the actual link state is down. If
+	 * so, it is not necessary to contiune.
+	 */
+	switch (event) {
+	case NETDEV_UP:
+		if (!link_state)
+			return 0;
+		break;
+
+	case NETDEV_DOWN:
+		if (link_state)
+			return 0;
+		break;
+	}
+

	Unless I misunderstood, Emil's comments elsewhere suggest that
the current ixgbe driver won't cause those, though.

	-J

---
	-Jay Vosburgh, jay.vosburgh@...onical.com

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ