netdev - Re: Bond recovery from BOND_LINK

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <10968.1509582913@famine>
Date:   Wed, 01 Nov 2017 17:35:13 -0700
From:   Jay Vosburgh <jay.vosburgh@...onical.com>
To:     Alex Sidorenko <alexandre.sidorenko@....com>
cc:     netdev@...r.kernel.org, Jarod Wilson <jarod@...hat.com>
Subject: Re: Bond recovery from BOND_LINK_FAIL state not working

Jay Vosburgh <jay.vosburgh@...onical.com> wrote:

>Alex Sidorenko <alexandre.sidorenko@....com> wrote:
>
>>The problem has been found while trying to deploy RHEL7 on HPE Synergy
>>platform, it is seen both in customer's environment and in HPE test lab.
>>
>>There are several bonds configured in TLB mode and miimon=100, all other
>>options are default. Slaves are connected to VirtualConnect
>>modules. Rebooting a VC module should bring one bond slave (ens3f0) down
>>temporarily, but not another one (ens3f1). But what we see is
>>
>>Oct 24 10:37:12 SYDC1LNX kernel: bond0: link status up again after 0 ms for interface ens3f1

	In net-next, I don't see a path in the code that will lead to
this message, as it would apparently require entering
bond_miimon_inspect in state BOND_LINK_FAIL but with downdelay set to 0.
If downdelay is 0, the code will transition to BOND_LINK_DOWN and not
remain in _FAIL state.

>>and it never recovers. When VC reboot is complete, everything goes back to normal again.
>>
>>Redhat has backported all recent upstream commits and instrumented the
>>bonding driver. We have found the following (when VC goes down)
>>
>>In bond_miimon_inspect() the first slave goes to
>>	bond_propose_link_state(slave, BOND_LINK_FAIL);
>>		and
>>	slave->new_link = BOND_LINK_DOWN;
>>
>>The second slave is still
>>	slave->link = BOND_LINK_UP;
>>		and
>>        slave->new_link = BOND_LINK_NOCHANGE;
>>
>>This is as expected. But in bond_miimon_commit() we see that _both_ slaves
>>are in BOND_LINK_FAIL.  That is, something changes the state of the second
>>slave from another thread. We suspect the NetworkManager, as the problem
>>is there _only_ when bonds are controlled by it, if we set
>>NM_CONTROLLED=no everything starts working normally.
>>
>>While we still do not understand how NM affects bond state, I think that
>>bonding driver needs to be made reliable enough to recover even from this
>>state.
>
>	You're suggesting that the bonding driver be able to distinguish
>"false" down states set by network manager from a genuine link failure?
>Am I misunderstanding?
>
>>At this moment when we enter bond_miimon_inspect() with
>>slave->link = BOND_LINK_FAIL and are in the following code
>>
>>                        /*FALLTHRU*/
>>                case BOND_LINK_BACK:
>>                        if (!link_state) {
>>                                bond_propose_link_state(slave, BOND_LINK_DOWN);
>>                                netdev_info(bond->dev, "link status down
>>again after %d ms for interface %s\n",
>>                                            (bond->params.updelay - slave->delay) *
>>                                            bond->params.miimon,
>>                                            slave->dev->name);
>>
>>                                commit++;
>>                                continue;
>>                        }
>>
>
>	Looking at bonding in the current net-next, if a slave enters
>bond_miimon_inspect with slave->link == BOND_LINK_FAIL, it will not
>proceed to the BOND_LINK_BACK block of the switch; BOND_LINK_FAIL does
>not fall through that far.
>
>	Did you actually mean the BOND_LINK_FAIL block of the switch?
>I'll assume so for the rest of my reply.
>
>>we propose a new state and do 'commit++', but we do not change
>>slave->new_link from BOND_LINK_NOCHANGE. As a result, bond_miimon_commit()
>>will not process this slave.
>>
>>The following patch fixes the issue:
>>
>>****
>>If we enter bond_miimon_inspect() with slave_link=BOND_LINK_FAIL
>>and recover, we do bond_propose_link_state(slave, BOND_LINK_UP);
>>but do not change slave->new_link, so it is left in
>>BOND_LINK_NOCHANGE. As a result, bond_miimon_commit() will not
>>process that slave and it never recovers. We need to set
>>slave->new_link = BOND_LINK_UP to make bond_miimon_commit() work
>
>	What is your downdelay set to?  In principle,
>bond_miimon_inspect should not enter with a slave in BOND_LINK_FAIL
>state if downdelay is 0.
>
>> drivers/net/bonding/bond_main.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>index c99dc59..07aa7ba 100644
>>--- a/drivers/net/bonding/bond_main.c
>>+++ b/drivers/net/bonding/bond_main.c
>>@@ -2072,6 +2072,7 @@ static int bond_miimon_inspect(struct bonding *bond)
>>                                            (bond->params.downdelay - slave->delay) *
>>                                            bond->params.miimon,
>>                                            slave->dev->name);
>>+                               slave->new_link = BOND_LINK_UP;
>>                                commit++;
>>                                continue;
>>                        }
>>-- 
>>2.7.4
>
>	Your patch does not apply to net-next, so I'm not absolutely
>sure where this is, but presuming that this is in the BOND_LINK_FAIL
>case of the switch, it looks like both BOND_LINK_FAIL and BOND_LINK_BACK
>will have the issue that if the link recovers or fails, respectively,
>within the delay window (for down/updelay > 0) it won't set a
>slave->new_link.
>
>	Looks like this got lost somewhere along the line, as originally
>the transition back to UP (or DOWN) happened immediately, and that has
>been lost somewhere.
>
>	I'll have to dig out when that broke, but I'll see about a test
>patch this afternoon.

	The case I was concerned with was moved around; the proposed
state is committed in bond_mii_monitor.  But to commit to _FAIL state,
the downdelay would have to be > 0.  I'm not seeing any errors in
net-next; can you reproduce your erroneous behavior on net-next?

	-J

---
	-Jay Vosburgh, jay.vosburgh@...onical.com