Message-ID: <20170131084134.405043f8@xeon-e3>
Date:   Tue, 31 Jan 2017 08:41:34 -0800
From:   Stephen Hemminger <stephen@...workplumber.org>
To:     Roopa Prabhu <roopa@...ulusnetworks.com>
Cc:     netdev@...r.kernel.org, davem@...emloft.net,
        nikolay@...ulusnetworks.com, tgraf@...g.ch,
        hannes@...essinduktion.org, jbenc@...hat.com, pshelar@....org,
        dsa@...ulusnetworks.com, hadi@...atatu.com
Subject: Re: [PATCH net-next 0/5] bridge: per vlan dst_metadata support

On Mon, 30 Jan 2017 21:57:10 -0800
Roopa Prabhu <roopa@...ulusnetworks.com> wrote:

> From: Roopa Prabhu <roopa@...ulusnetworks.com>
> 
> High level summary:
> lwt and dst_metadata have enabled vxlan l3 deployments
> to use a single vxlan netdev for multiple vnis eliminating the scalability
> problem with using a single vxlan netdev per vni. This series tries to
> do the same for vxlan netdevs in pure l2 bridged networks.
> Use-case/deployment and details are below.
> 
> Deployment scenario details:
> As we know VXLAN is used to build layer 2 virtual networks across the
> underlay layer3 infrastructure. A VXLAN tunnel endpoint (VTEP)
> originates and terminates VXLAN tunnels. And a VTEP can be a TOR switch
> or a vswitch in the hypervisor. This patch series mainly
> focuses on the TOR switch configured as a Vtep. Vxlan segment ID (vni)
> along with vlan id is used to identify layer 2 segments in a vxlan
> overlay network. Vxlan bridging is the function provided by Vteps to terminate
> vxlan tunnels and map the vxlan vni to traditional end host vlan. This is
> covered in the "VXLAN Deployment Scenarios" in sections 6 and 6.1 in RFC 7348.
> To provide vxlan bridging function, a vtep has to map vlan to a vni. The rfc
> says that the ingress VTEP device shall remove the IEEE 802.1Q VLAN tag in
> the original Layer 2 packet if there is one before encapsulating the packet
> into the VXLAN format to transmit it through the underlay network. The remote
> VTEP devices have information about the VLAN in which the packet will be
> placed based on their own VLAN-to-VXLAN VNI mapping configurations.
> 
> Existing solution:
> Without this patch series one can deploy such a vtep configuration by
> adding the local ports and vxlan netdevs into a vlan filtering bridge.
> The local ports are configured as trunk ports carrying all vlans.
> A vxlan netdev per vni is added to the bridge. Vlan mapping to vni is
> achieved by configuring the vlan as pvid on the corresponding vxlan netdev.
> The vxlan netdev only receives traffic corresponding to the vlan it is mapped
> to. This configuration maps traffic belonging to a vlan to the corresponding
> vxlan segment.
> 
>           -----------------------------------
>          |              bridge               |
>          |                                   |
>           -----------------------------------
>             |100,200       |100 (pvid)    |200 (pvid)
>             |              |              |
>            swp1          vxlan1000      vxlan2000
>                     
> This provides the required vxlan bridging function but poses a
> scalability problem with using a separate vxlan netdev for each vni.
> 
> Solution in this patch series:
> The goal is to use a single vxlan device to carry all vnis similar
> to the vxlan collect metadata mode but additionally allowing the bridge
> and vxlan driver to carry all the forwarding information and also learn.
> This implementation uses the existing dst_metadata infrastructure to map
> vlan to a tunnel id.
> - vxlan driver changes:
>     - enable collect metadata mode to be used with learning,
>       replication and fdb
>     - A single fdb table hashed by (mac, vni)
>     - rx path already has the vni
>     - tx path expects a vni in the packet with dst_metadata and relies
>       on learnt or static forwarding information table to forward the packet
> 
> - Bridge driver changes: per vlan dst_metadata support:
>     - Our use case is vxlan and 1-1 mapping between vlan and vni, but I have
>       kept the api generic for any tunnel info
>     - Uapi to configure/unconfigure/dump per vlan tunnel data
>     - new bridge port flag to turn this feature on/off. off by default
>     - ingress hook:
>         - if port is a tunnel port, use tunnel info in
>           attached dst_metadata to map it to a local vlan
>     - egress hook:
>         - if port is a tunnel port, use tunnel info attached to vlan
>           to set dst_metadata on the skb
> 
> Other approaches tried and vetoed:
> - tc vlan push/pop and tunnel metadata dst:
>     - though tc can be used to do part of this, these patches address a deployment
>       case where bridge driver vlan filtering and forwarding information
>       database along with vxlan driver forwarding information table and learning
>       are required.
> - making vxlan driver understand vlan-vni mapping:
>     - I had a series almost ready with this one but soon realized
>       it duplicated a lot of vlan handling code in the vxlan driver
> 
> Roopa Prabhu (5):
>   ip_tunnels: new IP_TUNNEL_INFO_BRIDGE flag for ip_tunnel_info mode
>   vxlan: support fdb and learning in COLLECT_METADATA mode
>   bridge: uapi: add per vlan tunnel info
>   bridge: per vlan dst_metadata netlink support
>   bridge: vlan dst_metadata hooks in ingress and egress paths
> 
>  drivers/net/vxlan.c            |  211 +++++++++++++++++-----------
>  include/linux/if_bridge.h      |    1 +
>  include/net/ip_tunnels.h       |    1 +
>  include/uapi/linux/if_bridge.h |   11 ++
>  include/uapi/linux/if_link.h   |    1 +
>  include/uapi/linux/neighbour.h |    1 +
>  net/bridge/Makefile            |    5 +-
>  net/bridge/br_forward.c        |    2 +-
>  net/bridge/br_input.c          |    8 +-
>  net/bridge/br_netlink.c        |  140 +++++++++++++------
>  net/bridge/br_netlink_tunnel.c |  296 ++++++++++++++++++++++++++++++++++++++++
>  net/bridge/br_private.h        |   12 ++
>  net/bridge/br_private_tunnel.h |   47 +++++++
>  net/bridge/br_vlan.c           |   24 +++-
>  net/bridge/br_vlan_tunnel.c    |  203 +++++++++++++++++++++++++++
>  15 files changed, 837 insertions(+), 126 deletions(-)
>  create mode 100644 net/bridge/br_netlink_tunnel.c
>  create mode 100644 net/bridge/br_private_tunnel.h
>  create mode 100644 net/bridge/br_vlan_tunnel.c
> 

I still think such complexity should be done with OVS, where the architecture
is much more flexible, rather than adding lots more special-case hacks to the
bridge.
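
For readers following the thread, the per-vlan tunnel mapping and the
(mac, vni) fdb described in the quoted cover letter can be modelled roughly
as below. This is only a Python sketch of the idea, not the kernel code from
the series: the class names VlanTunnelMap and VniFdb, their methods, and the
sample MAC and IP addresses are all hypothetical, and the strict 1-1 check
simply reflects the vlan-to-vni use case the cover letter mentions.

```python
# Illustrative model only; names and structure are assumptions, not the
# data structures used by the bridge or vxlan drivers.
from typing import Dict, Optional, Tuple


class VlanTunnelMap:
    """Hypothetical 1-1 vlan <-> vni table for a single tunnel port."""

    def __init__(self) -> None:
        self._vlan_to_vni: Dict[int, int] = {}
        self._vni_to_vlan: Dict[int, int] = {}

    def add(self, vlan: int, vni: int) -> None:
        # Reject duplicates so the mapping stays 1-1 in both directions.
        if vlan in self._vlan_to_vni or vni in self._vni_to_vlan:
            raise ValueError("vlan or vni already mapped")
        self._vlan_to_vni[vlan] = vni
        self._vni_to_vlan[vni] = vlan

    def ingress(self, vni: int) -> Optional[int]:
        # Ingress on the tunnel port: tunnel id from metadata -> local vlan.
        return self._vni_to_vlan.get(vni)

    def egress(self, vlan: int) -> Optional[int]:
        # Egress on the tunnel port: vlan -> tunnel id to attach as metadata.
        return self._vlan_to_vni.get(vlan)


class VniFdb:
    """Hypothetical single forwarding table keyed by (mac, vni)."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, int], str] = {}

    def learn(self, mac: str, vni: int, remote_ip: str) -> None:
        # The same MAC may appear in several vnis; the key keeps them apart.
        self._entries[(mac.lower(), vni)] = remote_ip

    def lookup(self, mac: str, vni: int) -> Optional[str]:
        return self._entries.get((mac.lower(), vni))


if __name__ == "__main__":
    vlan_map = VlanTunnelMap()
    vlan_map.add(100, 1000)
    vlan_map.add(200, 2000)

    fdb = VniFdb()
    fdb.learn("00:11:22:33:44:55", 1000, "10.0.0.2")

    # Egress: a frame in vlan 100 is sent with vni 1000 to the learnt remote.
    vni = vlan_map.egress(100)
    print(vni, fdb.lookup("00:11:22:33:44:55", vni))
```

With a table like this, the diagram's two devices (vxlan1000 and vxlan2000)
collapse into one tunnel port carrying both mappings, which is the scaling
argument made in the cover letter.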
