Message-ID: <385751.1726158973@famine>
Date: Thu, 12 Sep 2024 09:36:13 -0700
From: Jay Vosburgh <jv@...sburgh.net>
To: Hangbin Liu <liuhangbin@...il.com>
cc: netdev@...r.kernel.org, Andy Gospodarek <andy@...yhouse.net>,
"David S . Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Eric Dumazet <edumazet@...gle.com>,
Nikolay Aleksandrov <razor@...ckwall.org>,
Simon Horman <horms@...nel.org>, Aaron Conole <aconole@...hat.com>,
Ilya Maximets <i.maximets@....org>,
Adrian Moreno <amorenoz@...hat.com>,
Stanislas Faye <sfaye@...hat.com>
Subject: Re: [Discuss] ARP monitor for OVS bridge over bonding
Hangbin Liu <liuhangbin@...il.com> wrote:
>Hi all,
>
>Recently, one of our customers hit an issue with an OVS bridge over bonding, e.g.:
>
>  eth0      eth1
>   |         |
>   -- bond0 --
>        |
>      br-ex (ovs-vsctl add-port br-ex bond0; ip addr add 192.168.1.1/24 dev br-ex)
>
>
>Before sending ARP messages for bond slave detection, the bond needs to check
>whether br-ex is in the same data path as bond0 via the function
>bond_verify_device_path(), which uses netdev_for_each_upper_dev_rcu()
>to check all upper devices. This works with a normal bridge. But with an OVS
>bridge, the upper device is "ovs-system" instead of br-ex.
>
>After talking with the OVS developers, it turned out that the real upper OVS
>topology looks like
>
>             --------------------------------
>             |                              |
> br-ex  -----+--        ovs-system          |
>             |                              |
> br-int -----+--                            |
>             |                              |
>             |    bond0   eth2   veth42     |
>             |      |       |       |       |
>             |      |       |       |       |
>             -------+-------+-------+--------
>                    |       |       |
>                 +--+--+ physical   |
>                 |     |   link     |
>                eth0  eth1       veth43
>
>br-ex is not an upper link of bond0; ovs-system, instead, is the master
>of bond0. This makes us unable to verify that br-ex and bond0 are in the
>same datapath.

I'm guessing that this is in the context of an OpenStack
deployment, as "br-ex" and "br-int" are names commonly chosen for the
OVS bridges in OpenStack.

But, yes, OVS bridge configuration is very different from the
Linux bridge, and the ARP monitor was not designed with OVS in mind.

I'll also point out that OVS has its own bonding, although it
does not implement functionality equivalent to the ARP monitor.

However, OVS does provide an implementation of RFC 5880 BFD
(Bidirectional Forwarding Detection). The OpenStack deployments that
I'm familiar with typically use the kernel bonding in LACP mode along
with BFD. Is there a reason that OVS + BFD is unsuitable for your
purposes?

>On the other hand, as Adrián Moreno said, packets generated on br-ex
>could be routed anywhere using OpenFlow rules (including eth2 in the
>diagram). The same holds for a normal bridge: with tc/netfilter rules, packets
>could also be routed to another interface instead of bond0.

True, and, at least in the OpenStack OVN/OVS deployments I'm
familiar with, heavy use of OpenFlow rules is the usual configuration.
Those deployments also make use of tc rules for various purposes.

>So the rt interface check in bond_arp_send_all() is not always correct.
>Stanislas suggested adding a new parameter like 'arp monitor source interface'
>to bonding that the user could supply. Then we could do something like
>	if (rt->dst.dev == arp_src_iface->dev)
>		goto found;
>
>What do you think?

A single "arp_src_iface" parameter won't scale if there are
multiple ARP targets, as each target might need a different
"arp_src_iface".

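To make the scaling concern concrete, here is a minimal userspace sketch of
what a per-target variant of the proposed check might look like. It is not
kernel code and not a proposed patch: struct arp_target_src,
src_iface_for() and route_matches() are hypothetical names invented for
illustration, and devices are compared by name only to keep the example
self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mapping: one expected source interface per ARP target. */
struct arp_target_src {
    uint32_t target_ip;      /* IPv4 address, host byte order for simplicity */
    const char *src_iface;   /* interface the route to target_ip should use */
};

/* Return the configured source interface for 'target', or NULL if none. */
const char *src_iface_for(const struct arp_target_src *tbl, size_t n,
                          uint32_t target)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (tbl[i].target_ip == target)
            return tbl[i].src_iface;
    return NULL;
}

/*
 * Per-target analogue of "rt->dst.dev == arp_src_iface->dev": the route's
 * output device must match the interface configured for this target.
 */
int route_matches(const struct arp_target_src *tbl, size_t n,
                  uint32_t target, const char *route_dev)
{
    const char *want = src_iface_for(tbl, n, target);

    return want != NULL && strcmp(want, route_dev) == 0;
}
```

With two targets reached through different bridges, a single global
interface name would satisfy at most one of them, whereas the table above
answers per target.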
Also, the original purpose of bond_verify_device_path() is to
return VLAN tags in the device stack so that the ARP will be properly
tagged.
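As a rough illustration of that walk, the following userspace model searches
upward from a starting device for a target device and records the VLAN IDs
it passes on the way. It is only a sketch of the idea, not the kernel
implementation: struct dev, stack() and walk() are hypothetical stand-ins,
and the real function works on struct net_device under RCU.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_UPPERS 4
#define MAX_TAGS   8

/* Hypothetical userspace stand-in for a network device. */
struct dev {
    const char *name;
    int vlan_id;                     /* 0 means "not a VLAN device" */
    struct dev *uppers[MAX_UPPERS];  /* devices stacked directly above */
    size_t n_uppers;
};

/* Record 'upper' as being stacked directly on top of 'lower'. */
void stack(struct dev *lower, struct dev *upper)
{
    assert(lower->n_uppers < MAX_UPPERS);
    lower->uppers[lower->n_uppers++] = upper;
}

/*
 * Depth-first search from 'start' through its upper devices for 'end'.
 * Returns the number of VLAN IDs stored in 'tags' for the path found
 * (nearest to 'start' first), or -1 if 'end' is not reachable.
 * Call with ntags == 0; 'tags' must have room for MAX_TAGS entries.
 */
int walk(struct dev *start, struct dev *end, int *tags, int ntags)
{
    if (start == end)
        return ntags;

    for (size_t i = 0; i < start->n_uppers; i++) {
        struct dev *upper = start->uppers[i];
        int n = ntags;

        if (upper->vlan_id && n < MAX_TAGS)
            tags[n++] = upper->vlan_id;

        int found = walk(upper, end, tags, n);
        if (found >= 0)
            return found;
    }
    return -1;  /* no path: the caller has no tag list to use */
}
```

For a stack such as bond0 -> bond0.100 -> br0, the walk from bond0 to br0
succeeds and yields one tag (100). With bond0 enslaved to ovs-system and
br-ex not stacked above either of them, the walk from bond0 to br-ex
returns -1, which corresponds to the failure described above.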

I think what you're really asking for is an "I know what I'm
doing" option to bypass the checks in bond_arp_send_all(). That would
also skip the VLAN tag search, so it's not necessarily a perfect
solution.

Before considering such a change, I'd like to know why OVS + BFD
over a kernel bond attached to the OVS bridge is unsuitable for your use
case, as that's a common configuration I've seen with OVS.
-J
---
-Jay Vosburgh, jv@...sburgh.net