Message-ID: <511ADEBB.1000701@genband.com>
Date: Tue, 12 Feb 2013 18:30:51 -0600
From: Chris Friesen <chris.friesen@...band.com>
To: Jay Vosburgh <fubar@...ibm.com>
CC: bonding-devel@...ts.sourceforge.net,
netdev <netdev@...r.kernel.org>,
Stephen Hemminger <shemminger@...tta.com>,
bridge@...ts.linux-foundation.org
Subject: Re: how to handle bonding failover when using a bridge over the bond?
On 02/12/2013 06:02 PM, Jay Vosburgh wrote:
> Chris Friesen <chris.friesen@...band.com> wrote:
>
>> I've got a scenario that doesn't seem to be handled well by the current
>> bonding code in Linux, but maybe I'm missing something.
>>
>> I have a physical host with two ethernet links that are bonded together
>> (active/backup). Each link is connected to a separate L2 switch, which
>> are in turn connected with a crosslink for redundancy.
>>
>> The physical host is running multiple virtual machines each with a virtual
>> adapter. The virtual adapters and the bond are all bridged together to
>> allow communication between the virtual machines, the host, and the
>> outside world.
>>
>> Now suppose one of the slave links fails. The bond device will fail over to
>> the other slave and send out a gratuitous ARP on the newly active slave.
>> This will cause the L2 switches to update their lookup tables for the MAC
>> address associated with the bond (so it now points to the newly active
>> slave), but doesn't update the MAC addresses associated with the various
>> virtual machines. If someone on the network sends a packet to one of the
>> virtual machines, the switch will try to send it over the failed slave.
>
> If the link failure is such that there is no carrier on the
> switch port, the switch will drop the forwarding entry for the virtual
> machine's MAC address from that port. The traffic for the VM's MAC
> would then flood to all ports, presumably including the link to the
> other switch, which wouldn't have a forwarding entry for the MAC, either
> (or it would be the switch link port), and would also flood it to all
> ports, one of which is the correct one.
This makes sense, though it wouldn't cover the case where the link only
loses carrier in one direction, or the case where the bond is using ARP
monitoring and something fails beyond the first hop.
> Is this actually failing for you, or is this a thought
> experiment?
It actually failed. During a customer demo. :) From what I understand
it was a physical link pull, which (based on what you say above) should
have caused the switch to react appropriately.
I'll see if I can get some more information. Maybe the switches weren't
behaving properly or something.
>> What's the recommended solution for this? The logical solution would seem
>> to be to have something issue GARPs for each virtual machine when the bond
>> device fails over, but there doesn't seem to be any way to register for
>> notification (via rtnetlink for instance) when the bond fails over. I
>> could monitor for carrier loss, but that wouldn't work for the case where
>> bonding is using ARP monitoring.
>
> There is a NETDEV_BONDING_FAILOVER notifier that is called for
> active-backup mode when a new active slave is assigned. The
> rtnetlink_event function is on that chain, and will send an rtnetlink
> message, although I don't see that the actual event is included in the
> message.
If I'm reading this right, it will end up sending an RTM_NEWLINK message,
which seems a bit odd, since nothing in it identifies the failover.
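To make the limitation concrete, here is a rough sketch of what a userspace listener on the rtnetlink link group would be able to decode. The function names (parse_newlink, listen) are mine and purely illustrative. The constants are the values I believe the kernel headers use for RTM_NEWLINK, IFLA_IFNAME and RTMGRP_LINK. The point is that the listener gets an interface index and name, and would then have to work out for itself whether the active slave changed.

```python
import socket
import struct

RTM_NEWLINK = 16   # message type, as in <linux/rtnetlink.h>
IFLA_IFNAME = 3    # attribute type, as in <linux/if_link.h>
RTMGRP_LINK = 1    # multicast group for link notifications

NLMSGHDR = struct.Struct("=IHHII")    # len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BxHiII")  # family, pad, type, index, flags, change
RTATTR = struct.Struct("=HH")         # len, type


def align4(n):
    """Round up to the 4-byte boundary netlink uses for padding."""
    return (n + 3) & ~3


def parse_newlink(buf):
    """Return [(ifindex, ifname)] for each RTM_NEWLINK message in buf."""
    links = []
    off = 0
    while off + NLMSGHDR.size <= len(buf):
        msg_len, msg_type, _flags, _seq, _pid = NLMSGHDR.unpack_from(buf, off)
        end = off + msg_len
        if msg_len < NLMSGHDR.size or end > len(buf):
            break  # truncated or malformed message
        body = off + NLMSGHDR.size
        if msg_type == RTM_NEWLINK and body + IFINFOMSG.size <= end:
            ifindex = IFINFOMSG.unpack_from(buf, body)[2]
            name = None
            attr = body + IFINFOMSG.size
            while attr + RTATTR.size <= end:
                rta_len, rta_type = RTATTR.unpack_from(buf, attr)
                if rta_len < RTATTR.size or attr + rta_len > end:
                    break
                if rta_type == IFLA_IFNAME:
                    payload = buf[attr + RTATTR.size:attr + rta_len]
                    name = payload.rstrip(b"\0").decode()
                attr += align4(rta_len)
            links.append((ifindex, name))
        off += align4(msg_len)
    return links


def listen():
    """Print every link notification (Linux only; runs until interrupted)."""
    with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW,
                       socket.NETLINK_ROUTE) as sock:
        sock.bind((0, RTMGRP_LINK))
        while True:
            for ifindex, name in parse_newlink(sock.recv(65535)):
                print("link change on", name, "index", ifindex)
```

Calling listen() on the host would print a line for any link change on any interface, so a failover on the bond would look the same as any other RTM_NEWLINK for that device.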
> The bond doesn't track all of the MACs that go through it, but
> the bridge presumably does, and could respond to the FAILOVER notifier
> with something to notify the switch that the port assignments for the
> various MACs have changed.
That would probably make sense. I've added the bridging folks; maybe
they'll have a suggestion for how this sort of thing should be handled.
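For reference, the userspace workaround I had in mind (a GARP per virtual machine after failover) would look roughly like the sketch below. Everything here is an assumption for illustration: the helper names build_garp and send_garp are made up, the caller has to supply each VM's MAC and IP from somewhere (the bridge only knows MACs), and the interface to send on would be whatever device sits on top of the bond.

```python
import socket
import struct

ETH_P_ARP = 0x0806
BROADCAST = b"\xff" * 6


def build_garp(mac, ip):
    """Build a broadcast gratuitous ARP request frame for one MAC/IPv4 pair."""
    hw = bytes.fromhex(mac.replace(":", ""))
    addr = socket.inet_aton(ip)
    # Ethernet header: the source MAC is the VM's, so switches relearn its port.
    eth = BROADCAST + hw + struct.pack("!H", ETH_P_ARP)
    # ARP header: Ethernet (1), IPv4 (0x0800), hlen 6, plen 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # Sender and target protocol address are both the VM's own IP.
    arp += hw + addr + b"\x00" * 6 + addr
    return eth + arp


def send_garp(ifname, mac, ip):
    """Send one gratuitous ARP on ifname (Linux only, needs CAP_NET_RAW)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((ifname, 0))
        sock.send(build_garp(mac, ip))
```

Something would then call send_garp() once per VM when it learns of the failover, for example send_garp("br0", "52:54:00:12:34:56", "192.0.2.10"), where the bridge name and addresses are placeholders.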
Chris