Message-ID: <500F032D.3070104@genband.com>
Date: Tue, 24 Jul 2012 14:18:53 -0600
From: Chris Friesen <chris.friesen@...band.com>
To: Jay Vosburgh <fubar@...ibm.com>
CC: Jiri Pirko <jiri@...nulli.us>, netdev <netdev@...r.kernel.org>,
andy@...yhouse.net
Subject: Re: bonding and SR-IOV -- do we need arp_validation for loadbalancing
too?
On 07/24/2012 12:13 PM, Jay Vosburgh wrote:
> Jiri Pirko<jiri@...nulli.us> wrote:
>
>> Tue, Jul 24, 2012 at 05:57:03PM CEST, chris.friesen@...band.com wrote:
>>> Hi all,
>>>
>>> We've been starting to look at bonding VFs from separate physical
>>> devices in a guest, but we've run into a problem.
>>>
>>> The host is bonding the corresponding PFs, and it uses arp
>>> monitoring. What we have found is that any broadcast traffic from
>>> the guest (if they enable arp monitoring, for example) will be seen
>>> by the internal L2 switch of the NIC and sent up into the host, where
>>> the bonding driver will count it as incoming packets and use it to
>>> mark the link as good.
>>>
>>> The only solutions I've been able to come up with are:
>>> 1) add arp validation for load balancing modes as well as active-backup.
>> This is my favourite.... No reason to not to turn arp validation on.
>> TEAM device (teamd arpping linkwatch) does arp or NSNA validation
>> always.
> How does that operate for a load balancing mode?
>
> For arp validate to function (as it's implemented in bonding),
> the arp requests (broadcasts) or the arp replies (unicasts) must be seen
> by each slave at regular intervals. Most load balance systems
> (etherchannel or 802.3ad, for example) don't flood the broadcast
> requests to all members of a channel group, and the unicast replies only
> go to one member.
>
> This generally results in either only one slave staying up, or
> slaves going up and down at odd intervals. The arp monitor for the load
> balance modes is already dependent upon there being a steady stream of
> traffic to all slaves, and can be unreliable in low traffic conditions
> (because not all slaves receive traffic with sufficient frequency).

In load-balance mode, wouldn't it just work much like active-backup? If
it's a reply, verify that it came from the arp target; if it's a
request, check whether it came from one of the other slaves.
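
To make the check concrete, here is a rough sketch in Python rather than
kernel C. All names are made up for illustration and are not the bonding
driver's API. It also assumes that a request from a sibling slave can be
recognised because it carries the bond's own IP as sender and an arp
target as target IP; that is my assumption, not something established
in this thread.

```python
from dataclasses import dataclass

ARP_REQUEST = 1
ARP_REPLY = 2


@dataclass(frozen=True)
class ArpPacket:
    op: int          # ARP_REQUEST or ARP_REPLY
    sender_ip: str   # sender protocol address
    target_ip: str   # target protocol address


def arp_counts_for_link_health(pkt, bond_ip, arp_targets):
    """Return True if pkt should refresh the receiving slave's last-rx time."""
    if pkt.op == ARP_REPLY:
        # A reply only counts if an arp target sent it to the bond.
        return pkt.sender_ip in arp_targets and pkt.target_ip == bond_ip
    if pkt.op == ARP_REQUEST:
        # A request only counts if it looks like a probe from a sibling
        # slave (assumption: sender is the bond IP, target is an arp target).
        return pkt.sender_ip == bond_ip and pkt.target_ip in arp_targets
    return False
```

With this kind of filter, a broadcast arp request from a guest (sender IP
is the guest's, not the bond's) would no longer refresh the slave.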

In our case we control the L2 switches involved, so we ensure that the
broadcast arp request is sent to all the other slaves, while the reply
comes back to the sender. I think that still leaves a window where a
device with a faulty tx but functional rx would never be detected by the
monitor.
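
The faulty-tx window can be shown with a toy model (Python, purely
illustrative; the function and its parameters are hypothetical and
ignore timing entirely). It assumes the switch floods each slave's
request to its siblings, as described for our setup.

```python
def slave_marked_up(tx_ok, rx_ok, peers_sending):
    """Toy model: does this slave receive any validated ARP traffic?"""
    if not rx_ok:
        # Nothing is received at all, so the monitor sees the failure.
        return False
    # A reply to our own request exists only if the request got out.
    sees_own_reply = tx_ok
    # The switch floods sibling requests to us regardless of our tx state.
    sees_peer_request = peers_sending > 0
    return sees_own_reply or sees_peer_request
```

Here slave_marked_up(tx_ok=False, rx_ok=True, peers_sending=1) comes
out True: the slave cannot transmit, yet sibling requests keep it
marked up.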

A more general solution might be to have the device driver also track
the time of the last incoming packet that came from the external network
(rather than from a VF), and have the bond driver ignore VF-originated
packets for the purpose of link health. Doing this efficiently would
likely require some kind of hardware support, though. As an example, the
82599 seems to support this with the "LB" bit in the rx descriptor.
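
Roughly what I have in mind, sketched in Python rather than driver code.
The class and field names are hypothetical; "loopback" stands in for
something like the LB bit, and the missed_max threshold is an assumption
added only to make the example complete.

```python
from dataclasses import dataclass


@dataclass
class RxDescriptor:
    loopback: bool  # hypothetical flag: frame was switched internally from a VF


class SlaveRxState:
    def __init__(self):
        self.last_rx = None           # time of any received frame
        self.last_external_rx = None  # time of last frame that came off the wire

    def on_receive(self, desc, now):
        self.last_rx = now
        if not desc.loopback:
            # Only frames from the external network refresh link health.
            self.last_external_rx = now

    def link_ok(self, now, arp_interval, missed_max=2):
        """Link is good if an external frame arrived recently enough."""
        if self.last_external_rx is None:
            return False
        return now - self.last_external_rx <= arp_interval * missed_max
```

The bond would then consult last_external_rx instead of last_rx, so
guest broadcasts looped back by the NIC's internal switch would not keep
a dead link marked as good.
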
Chris