Message-ID: <316685.1731029549@famine>
Date: Thu, 07 Nov 2024 17:32:29 -0800
From: Jay Vosburgh <jv@...sburgh.net>
To: Hangbin Liu <liuhangbin@...il.com>
cc: netdev@...r.kernel.org
Subject: Re: [Question]: should we consider arp missed max during bond_ab_arp_probe()?

Hangbin Liu <liuhangbin@...il.com> wrote:
>Hi Jay,
>
>Our QE reported that when there is no active slave during
>bond_ab_arp_probe(), the slaves send ARP probe messages one by one. This
>makes the switch's MAC table flap quickly, and sometimes even makes the
>switch stop learning MAC addresses. So should we consider arp_missed_max
>during bond_ab_arp_probe()? That is, give each slave more chances to send
>probe messages before switching to another slave. What do you think?

Well, "quickly" here depends entirely on what the value of
arp_interval is. It's been quite a while since I looked into the
details of this particular behavior, but at the time I didn't see the
switches I had issue flap warnings. If memory serves, I usually tested
with arp_interval in the realm of 100ms, with anywhere from 2 to 6
interfaces in the bond.

What settings are you using for the bond, and what model of
switch exhibits the behavior you describe?

That said, the intent of the current implementation is to cycle
through the interfaces in the bond relatively quickly when no interfaces
are up, under the theory that such behavior finds an available interface
in the minimum time.

I'm not necessarily opposed to having each probe "step," so to
speak, perform multiple ARP probe checks. However, I wonder if this is
a complicated workaround for not wanting to change a configuration
setting on a switch, and it would only make things better by chance
(i.e., that the probes just happen to now take long enough to not run
afoul of the switch's time limit for some flap parameter).
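
To put rough numbers on the trade-off discussed here, a small Python
sketch follows. It is a simplified model, not the kernel's
bond_ab_arp_probe() logic: it assumes exactly one probe is sent per
arp_interval, that slaves are visited in a fixed round-robin order, and
that all slaves share one MAC address, so every change of probing slave
shows up at the switch as a MAC move. The function names and the
probes_per_step parameter are made up for illustration.

```python
from itertools import islice


def probe_schedule(slaves, arp_interval_ms, probes_per_step=1):
    """Yield (time_ms, slave) for each probe while no slave is active.

    probes_per_step=1 models the one-probe-per-slave rotation described
    in this thread; larger values model the proposed change
    (hypothetical parameter, not a real bonding option).
    """
    t = 0
    while True:
        for slave in slaves:
            for _ in range(probes_per_step):
                yield t, slave
                t += arp_interval_ms


def worst_case_first_probe_ms(n_slaves, arp_interval_ms, probes_per_step=1):
    """Time until the last slave in the rotation sends its first probe."""
    return (n_slaves - 1) * probes_per_step * arp_interval_ms


def slave_changes_per_second(arp_interval_ms, probes_per_step=1):
    """How often the probing slave changes (assumes more than one slave)."""
    return 1000 / (arp_interval_ms * probes_per_step)


if __name__ == "__main__":
    # First four probes for a two-slave bond at arp_interval=100.
    for when, who in islice(probe_schedule(["eth0", "eth1"], 100), 4):
        print(when, who)
    print(worst_case_first_probe_ms(4, 100), worst_case_first_probe_ms(4, 100, 3))
    print(slave_changes_per_second(100), slave_changes_per_second(100, 3))
```

Under this model, with 4 slaves and arp_interval=100, one probe per step
means the probing slave changes 10 times per second and the last slave
is first tried after 300 ms. Three probes per step cut the change rate
to about 3.3 per second but push that worst case out to 900 ms. Whether
the lower rate stays under a given switch's flap threshold is exactly
the part that depends on the switch.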

-J

---
-Jay Vosburgh, jv@...sburgh.net