netdev - Re: Bond recovery from BOND_LINK

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <28118.1509572045@famine>
Date:   Wed, 01 Nov 2017 14:34:05 -0700
From:   Jay Vosburgh <jay.vosburgh@...onical.com>
To:     Alex Sidorenko <alexandre.sidorenko@....com>
cc:     netdev@...r.kernel.org, Jarod Wilson <jarod@...hat.com>
Subject: Re: Bond recovery from BOND_LINK_FAIL state not working

Alex Sidorenko <alexandre.sidorenko@....com> wrote:

>The problem has been found while trying to deploy RHEL7 on HPE Synergy
>platform, it is seen both in customer's environment and in HPE test lab.
>
>There are several bonds configured in TLB mode and miimon=100, all other
>options are default. Slaves are connected to VirtualConnect
>modules. Rebooting a VC module should bring one bond slave (ens3f0) down
>temporarily, but not another one (ens3f1). But what we see is
>
>Oct 24 10:37:12 SYDC1LNX kernel: bond0: link status up again after 0 ms for interface ens3f1
>
>and it never recovers. When VC reboot is complete, everything goes back to normal again.
>
>Redhat has backported all recent upstream commits and instrumented the
>bonding driver. We have found the following (when VC goes down)
>
>In bond_miimon_inspect() the first slave goes to
>	bond_propose_link_state(slave, BOND_LINK_FAIL);
>		and
>	slave->new_link = BOND_LINK_DOWN;
>
>The second slave is still
>	slave->link = BOND_LINK_UP;
>		and
>        slave->new_link = BOND_LINK_NOCHANGE;
>
>This is as expected. But in bond_miimon_commit() we see that _both_ slaves
>are in BOND_LINK_FAIL.  That is, something changes the state of the second
>slave from another thread. We suspect the NetworkManager, as the problem
>is there _only_ when bonds are controlled by it, if we set
>NM_CONTROLLED=no everything starts working normally.
>
>While we still do not understand how NM affects bond state, I think that
>bonding driver needs to be made reliable enough to recover even from this
>state.

	You're suggesting that the bonding driver be able to distinguish
"false" down states set by network manager from a genuine link failure?
Am I misunderstanding?

>At this moment when we enter bond_miimon_inspect() with
>slave->link = BOND_LINK_FAIL and are in the following code
>
>                        /*FALLTHRU*/
>                case BOND_LINK_BACK:
>                        if (!link_state) {
>                                bond_propose_link_state(slave, BOND_LINK_DOWN);
>                                netdev_info(bond->dev, "link status down
>again after %d ms for interface %s\n",
>                                            (bond->params.updelay - slave->delay) *
>                                            bond->params.miimon,
>                                            slave->dev->name);
>
>                                commit++;
>                                continue;
>                        }
>

	Looking at bonding in the current net-next, if a slave enters
bond_miimon_inspect with slave->link == BOND_LINK_FAIL, it will not
proceed to the BOND_LINK_BACK block of the switch; BOND_LINK_FAIL does
not fall through that far.

	Did you actually mean the BOND_LINK_FAIL block of the switch?
I'll assume so for the rest of my reply.

>we propose a new state and do 'commit++', but we do not change
>slave->new_link from BOND_LINK_NOCHANGE. As a result, bond_miimon_commit()
>will not process this slave.
>
>The following patch fixes the issue:
>
>****
>If we enter bond_miimon_inspect() with slave_link=BOND_LINK_FAIL
>and recover, we do bond_propose_link_state(slave, BOND_LINK_UP);
>but do not change slave->new_link, so it is left in
>BOND_LINK_NOCHANGE. As a result, bond_miimon_commit() will not
>process that slave and it never recovers. We need to set
>slave->new_link = BOND_LINK_UP to make bond_miimon_commit() work

	What is your downdelay set to?  In principle,
bond_miimon_inspect should not enter with a slave in BOND_LINK_FAIL
state if downdelay is 0.

> drivers/net/bonding/bond_main.c | 1 +
> 1 file changed, 1 insertion(+)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index c99dc59..07aa7ba 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2072,6 +2072,7 @@ static int bond_miimon_inspect(struct bonding *bond)
>                                            (bond->params.downdelay - slave->delay) *
>                                            bond->params.miimon,
>                                            slave->dev->name);
>+                               slave->new_link = BOND_LINK_UP;
>                                commit++;
>                                continue;
>                        }
>-- 
>2.7.4

	Your patch does not apply to net-next, so I'm not absolutely
sure where this is, but presuming that this is in the BOND_LINK_FAIL
case of the switch, it looks like both BOND_LINK_FAIL and BOND_LINK_BACK
will have the issue that if the link recovers or fails, respectively,
within the delay window (for down/updelay > 0) it won't set a
slave->new_link.

	Looks like this got lost somewhere along the line, as originally
the transition back to UP (or DOWN) happened immediately, and that has
been lost somewhere.

	I'll have to dig out when that broke, but I'll see about a test
patch this afternoon.

	-J

---
	-Jay Vosburgh, jay.vosburgh@...onical.com