[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID:
<MW3PR15MB39135B6B84163690576F95FDFAB02@MW3PR15MB3913.namprd15.prod.outlook.com>
Date: Sun, 13 Apr 2025 02:37:20 +0000
From: David Wilder <wilder@...ibm.com>
To: Ilya Maximets <i.maximets@....org>, Jay Vosburgh <jv@...sburgh.net>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"pradeeps@...ux.vnet.ibm.com" <pradeeps@...ux.vnet.ibm.com>,
Pradeep
Satyanarayana <pradeep@...ibm.com>,
Adrian Moreno Zapata
<amorenoz@...hat.com>,
Hangbin Liu <haliu@...hat.com>
Subject: RE: [PATCH net-next v1 1/1] bonding: Adding limmited support for ARP
monitoring with ovs.
>>> Adding limited support for the ARP Monitoring feature when ovs is
>>> configured above the bond. When no vlan tags are used in the configuration
>>> or when the tag is added between the bond interface and the ovs bridge arp
>>> monitoring will function correctly. The use of tags between the ovs bridge
>>> and the routed interface are not supported.
>>
>> Looking at the patch, it isn't really "adding support," but
>> rather is disabling the "is this IP address configured above the bond"
>> checks if the bond is a member of an OVS bridge. It also seems like it
>> would permit any ARP IP target, as long as the address is configured
>> somewhere on the system.
>>
>> Stated another way, the route lookup in bond_arp_send_all() for
>> the target IP address must return a device, but the logic to match that
>> device to the interface stack above the bond will always succeed if the
>> bond is a member of any OVS bridge.
>>
>> For example, given:
>>
>> [ eth0, eth1 ] -> bond0 -> ovs-br -> ovs-port IP=10.0.0.1
>> eth2 IP=20.0.0.2
>>
>> Configuring arp_ip_target=20.0.0.2 on bond0 would apparently
>> succeed after this patch is applied, and the bond would send ARPs for
>> 20.0.0.2.
>>
>>> For example:
>>> 1) bond0 -> ovs-br -> ovs-port (x.x.x.x) is supported
>>> 2) bond0 -> bond0.100 -> ovs-br -> ovs-port (x.x.x.x) is supported.
>>> 3) bond0 -> ovs-br -> ovs-port -> ovs-port.100 (x.x.x.x) is not supported.
>>>
>>> Configurations #1 and #2 were tested and verified to function corectly.
>>> In the second configuration the correct vlan tags were seen in the arp.
>>
>> Assuming that I'm understanding the behavior correctly, I'm not
>> sure that "if OVS then do whatever" is the right way to go, particularly
>> since this would still exhibit mysterious failures if a VLAN is
>> configured within OVS itself (case 3, above).
>
> Note: vlan can also be pushed or removed by the OpenFlow pipeline within
> openvswitch between the ovs-port and the bond0. So, there is actually no
> reliable way to detect the correct set of vlan tags that should be used.
> And also, even if the IP is assigned to the ovs-port that is part of the
> same OVS bridge, there is no guarantee that packets routed to that IP can
> actually egress from the bond0, as the forwarding rules inside the OVS
>datapath can be arbitrarily complex.
>
> And all that is not limited to OVS even, as the cover letter mentions, TC
> or nftables in the linux bridge or even eBPF or XDP programs are not that
> different complexity-wise and can do most of the same things breaking the
> assumptions bonding code makes.
>
>>
>> I understand that the architecture of OVS limits the ability to
>> do these sorts of checks, but I'm unconvinced that implementing this
>> support halfway is going to create more issues than it solves.
>>
>> Lastly, thinking out loud here, I'm generally loathe to add more
>> options to bonding, but I'm debating whether this would be worth an
>> "ovs-is-a-black-box" option somewhere, so that users would have to
>> opt-in to the OVS alternate realm.
> I agree that adding options is almost never a great solution. But I had a
> similar thought. I don't think this option should be limited to OVS though,
>as OVS is only one of the cases where the current verification logic is not
>sufficient.
>
What if we build on the arp_ip_target setting. Allow for a list of vlan tags
to be appended to each target. Something like: arp_ip_target=x.x.x.x[vlan,vlan,...].
If a list of tags is omitted it works as before, if a list is supplied assume we know what were doing
and use that instead of calling bond_verify_device_path(). An empty list would be valid.
>>
>> -J
>>
>>> Signed-off-by: David J Wilder <wilder@...ibm.com>
>>> Signed-off-by: Pradeep Satyanarayana <pradeeps@...ux.vnet.ibm.com>
>>> ---
>>> drivers/net/bonding/bond_main.c | 8 +++++++-
>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>> index 950d8e4d86f8..6f71a567ba37 100644
>>> --- a/drivers/net/bonding/bond_main.c
>>> +++ b/drivers/net/bonding/bond_main.c
>>> @@ -3105,7 +3105,13 @@ struct bond_vlan_tag *bond_verify_device_path(struct net_device *start_dev,
>>> struct net_device *upper;
>>> struct list_head *iter;
>>>
>>> - if (start_dev == end_dev) {
>>> + /* If start_dev is an OVS port then we have encountered an openVswitch
>
> nit: Strange choice to capitalize 'V'. It should be all lowercase or a full
> 'Open vSwitch' instead.
>>> + * bridge and can't go any further. The programming of the switch table
>>> + * will determine what packets will be sent to the bond. We can make no
>>> + * further assumptions about the network above the bond.
>>> + */
>>> +
>>> + if (start_dev == end_dev || netif_is_ovs_port(start_dev)) {
>>> tags = kcalloc(level + 1, sizeof(*tags), GFP_ATOMIC);
>>> if (!tags)
>>> return ERR_PTR(-ENOMEM);
>>
>> ---
>> -Jay Vosburgh, jv@...sburgh.net
>
> Best regards, Ilya Maximets.
David Wilder (wilder@...ibm.com)
Powered by blists - more mailing lists