Message-ID: <95a773f6-5f88-712e-c494-9414d7090144@blackwall.org>
Date: Wed, 19 Apr 2023 15:30:07 +0300
From: Nikolay Aleksandrov <razor@...ckwall.org>
To: Ido Schimmel <idosch@...dia.com>, netdev@...r.kernel.org,
bridge@...ts.linux-foundation.org
Cc: davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
edumazet@...gle.com, roopa@...dia.com, petrm@...dia.com,
mlxsw@...dia.com
Subject: Re: [RFC PATCH net-next 0/9] bridge: Add per-{Port, VLAN} neighbor
suppression

On 13/04/2023 12:58, Ido Schimmel wrote:
> Background
> ==========
>
> In order to minimize the flooding of ARP and ND messages in the VXLAN
> network, EVPN includes provisions [1] that allow participating VTEPs to
> suppress such messages in case they know the MAC-IP binding and can
> reply on behalf of the remote host. In Linux, the above is implemented
> in the bridge driver using a per-port option called "neigh_suppress"
> that was added in kernel version 4.15 [2].
>
> Motivation
> ==========
>
> Some applications use ARP messages as keepalives between the application
> nodes in the network. This works perfectly well when two nodes are
> connected to the same VTEP. When a node goes down it will stop
> responding to ARP requests and the other node will notice it
> immediately.
>
> However, when the two nodes are connected to different VTEPs and
> neighbor suppression is enabled, the local VTEP will reply to ARP
> requests even after the remote node went down, until certain timers
> expire and the EVPN control plane decides to withdraw the MAC/IP
> Advertisement route for the address. Therefore, some users would like to
> be able to disable neighbor suppression on VLANs where such applications
> reside and keep it enabled on the rest.
>
> Implementation
> ==============
>
> The proposed solution is to allow user space to control neighbor
> suppression on a per-{Port, VLAN} basis, in a similar fashion to other
> per-port options that gained per-{Port, VLAN} counterparts such as
> "mcast_router". This allows users to benefit from the operational
> simplicity and scalability associated with shared VXLAN devices (i.e.,
> external / collect-metadata mode), while still allowing for per-VLAN/VNI
> neighbor suppression control.
>
> The user interface is extended with a new "neigh_vlan_suppress" bridge
> port option that allows user space to enable per-{Port, VLAN} neighbor
> suppression on the bridge port. When enabled, the existing
> "neigh_suppress" option has no effect and neighbor suppression is
> controlled using a new "neigh_suppress" VLAN option. Example usage:
>
>  # bridge link set dev vxlan0 neigh_vlan_suppress on
>  # bridge vlan add vid 10 dev vxlan0
>  # bridge vlan set vid 10 dev vxlan0 neigh_suppress on
>
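The precedence described above can be summarized as a minimal sketch. This is
only a model of the semantics in the cover letter, not code from the patches:
the struct and function names are hypothetical, and treating a VLAN that is
not configured on the port as "not suppressed" is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two bridge port options; not kernel structures. */
struct port_opts {
	bool neigh_suppress;      /* existing per-port option */
	bool neigh_vlan_suppress; /* new option: defer to per-VLAN state */
};

/* Hypothetical model of the new per-VLAN "neigh_suppress" option. */
struct vlan_opts {
	bool neigh_suppress;
};

/*
 * Return true when neighbor suppression applies to a {port, VLAN} pair.
 * vlan may be NULL when the VLAN is not configured on the port; that case
 * is assumed to mean "no suppression" while per-VLAN mode is enabled.
 */
static bool suppress_enabled(const struct port_opts *port,
			     const struct vlan_opts *vlan)
{
	/* Per-VLAN mode: the per-port "neigh_suppress" has no effect. */
	if (port->neigh_vlan_suppress)
		return vlan != NULL && vlan->neigh_suppress;

	/* Legacy mode: only the per-port option matters. */
	return port->neigh_suppress;
}
```

With "neigh_vlan_suppress" on, a VLAN whose "neigh_suppress" is off is not
suppressed even if the port-level "neigh_suppress" is on, which is the case
the Motivation section asks for.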
> Testing
> =======
>
> Tested using existing bridge selftests. Added a dedicated selftest in
> the last patch.
>
> Patchset overview
> =================
>
> Patches #1-#5 are preparations.
>
> Patch #6 adds per-{Port, VLAN} neighbor suppression support to the
> bridge's data path.
>
> Patches #7-#8 add the required netlink attributes to enable the feature.
>
> Patch #9 adds a selftest.
>
> iproute2 patches can be found here [3].
>
> [1] https://www.rfc-editor.org/rfc/rfc7432#section-10
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a42317785c898c0ed46db45a33b0cc71b671bf29
> [3] https://github.com/idosch/iproute2/tree/submit/neigh_suppress_v1
>
> Ido Schimmel (9):
>   bridge: Reorder neighbor suppression check when flooding
>   bridge: Pass VLAN ID to br_flood()
>   bridge: Add internal flags for per-{Port, VLAN} neighbor suppression
>   bridge: Take per-{Port, VLAN} neighbor suppression into account
>   bridge: Encapsulate data path neighbor suppression logic
>   bridge: Add per-{Port, VLAN} neighbor suppression data path support
>   bridge: vlan: Allow setting VLAN neighbor suppression state
>   bridge: Allow setting per-{Port, VLAN} neighbor suppression state
>   selftests: net: Add bridge neighbor suppression test
>
> include/linux/if_bridge.h | 1 +
> include/uapi/linux/if_bridge.h | 1 +
> include/uapi/linux/if_link.h | 1 +
> net/bridge/br_arp_nd_proxy.c | 33 +-
> net/bridge/br_device.c | 8 +-
> net/bridge/br_forward.c | 8 +-
> net/bridge/br_if.c | 2 +-
> net/bridge/br_input.c | 2 +-
> net/bridge/br_netlink.c | 8 +-
> net/bridge/br_private.h | 5 +-
> net/bridge/br_vlan.c | 1 +
> net/bridge/br_vlan_options.c | 20 +-
> net/core/rtnetlink.c | 2 +-
> tools/testing/selftests/net/Makefile | 1 +
> .../net/test_bridge_neigh_suppress.sh | 862 ++++++++++++++++++
> 15 files changed, 936 insertions(+), 19 deletions(-)
> create mode 100755 tools/testing/selftests/net/test_bridge_neigh_suppress.sh
>
The set looks good to me: nicely split and pretty straightforward.
For the set:
Acked-by: Nikolay Aleksandrov <razor@...ckwall.org>