Message-ID: <20150817165500.GA21512@vps.falico.eu>
Date:	Mon, 17 Aug 2015 18:55:00 +0200
From:	Veaceslav Falico <vfalico@...il.com>
To:	Jarod Wilson <jarod@...hat.com>
Cc:	linux-kernel@...r.kernel.org,
	Uwe Koziolek <uwe.koziolek@...knee.com>,
	Jay Vosburgh <j.vosburgh@...il.com>,
	Andy Gospodarek <gospo@...ulusnetworks.com>,
	netdev@...r.kernel.org
Subject: Re: [PATCH] net/bonding: send arp in interval if no active slave

On Mon, Aug 17, 2015 at 12:23:03PM -0400, Jarod Wilson wrote:
>From: Uwe Koziolek <uwe.koziolek@...knee.com>
>
>With some very finicky switch hardware, active backup bonding can get into
>a situation where we play ping-pong between interfaces, trying to get one
>to come up as the active slave. There seems to be an issue with the
>switch's arp replies either taking too long, or simply getting lost, so we
>wind up unable to get any interface up and active. Sometimes, the issue
>sorts itself out after a while, sometimes it doesn't.
>
>Testing with num_grat_arp has proven fruitless, but sending an additional
>arp on curr_arp_slave if we're still in the arp_interval timeslice in
>bond_ab_arp_probe(), has shown to produce 100% reliability in testing with
>this hardware combination.

Sorry, I don't understand the logic of why it works, or what exactly we are
fixing here.

It also completely breaks the link state management logic in the case where
there is no current active slave for 2*arp_interval.

Could you please elaborate on what exactly is fixed here, and how it works? :)

P.S. Maybe num_grat_arp could help?

>
>[jarod: manufacturing of changelog]
>CC: Jay Vosburgh <j.vosburgh@...il.com>
>CC: Veaceslav Falico <vfalico@...il.com>
>CC: Andy Gospodarek <gospo@...ulusnetworks.com>
>CC: netdev@...r.kernel.org
>Signed-off-by: Uwe Koziolek <uwe.koziolek@...knee.com>
>Signed-off-by: Jarod Wilson <jarod@...hat.com>
>---
> drivers/net/bonding/bond_main.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 0c627b4..60b9483 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2794,6 +2794,11 @@ static bool bond_ab_arp_probe(struct bonding *bond)
> 			return should_notify_rtnl;
> 	}
>
>+	if (bond_time_in_interval(bond, curr_arp_slave->last_link_up, 2)) {
>+		bond_arp_send_all(bond, curr_arp_slave);
>+		return should_notify_rtnl;
>+	}
>+
> 	bond_set_slave_inactive_flags(curr_arp_slave, BOND_SLAVE_NOTIFY_LATER);
>
> 	bond_for_each_slave_rcu(bond, slave, iter) {
>-- 
>1.8.3.1
>
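For reference, here is a minimal userspace sketch of the check the hunk above adds, as I read it. It is not the driver code: tick_after_eq(), tick_before_eq(), time_in_interval() and probe_action() are hypothetical stand-ins, plain unsigned long tick values replace jiffies, and the window bounds (one interval before the timestamp, mod intervals plus half an interval after it) are my assumption about how bond_time_in_interval() behaves.

```c
#include <assert.h>
#include <stdbool.h>

/* Wrap-safe comparisons on a free-running tick counter (a stand-in for
 * jiffies). Hypothetical helpers, not the kernel macros themselves. */
static bool tick_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

static bool tick_before_eq(unsigned long a, unsigned long b)
{
	return (long)(b - a) >= 0;
}

/* Assumed window: [last_act - interval, last_act + mod*interval + interval/2]. */
static bool time_in_interval(unsigned long now, unsigned long last_act,
			     unsigned long interval, int mod)
{
	unsigned long lo = last_act - interval;
	unsigned long hi = last_act + (unsigned long)mod * interval + interval / 2;

	return tick_after_eq(now, lo) && tick_before_eq(now, hi);
}

enum probe_action {
	PROBE_RESEND_ARP,	/* stay on curr_arp_slave and send another ARP */
	PROBE_ROTATE_SLAVE	/* fall through and try the next slave */
};

/* Models the added early return: while the current probe slave's
 * last_link_up timestamp is still inside the window (mod = 2, as in the
 * patch), keep probing it instead of rotating to another slave. */
static enum probe_action probe_action(unsigned long now,
				      unsigned long last_link_up,
				      unsigned long arp_interval)
{
	if (time_in_interval(now, last_link_up, arp_interval, 2))
		return PROBE_RESEND_ARP;
	return PROBE_ROTATE_SLAVE;
}

/* Small self-check of the window edges for interval = 100 ticks. */
static void self_check(void)
{
	assert(probe_action(1250, 1000, 100) == PROBE_RESEND_ARP);
	assert(probe_action(1251, 1000, 100) == PROBE_ROTATE_SLAVE);
}
```

Under these assumptions, with arp_interval equal to 100 ticks and last_link_up at tick 1000, the probe stays on the same slave up to tick 1250 and only rotates after that, which is the 2*arp_interval behaviour questioned above.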