lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 8 Oct 2019 12:28:23 -0400
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Martin Varghese <martinvarghesenokia@...il.com>
Cc:     Network Development <netdev@...r.kernel.org>,
        David Miller <davem@...emloft.net>, corbet@....net,
        scott.drennan@...ia.com, Jiri Benc <jbenc@...hat.com>,
        martin.varghese@...ia.com
Subject: Re: [PATCH net-next 1/2] UDP tunnel encapsulation module for
 tunnelling different protocols like MPLS,IP,NSH etc.

On Tue, Oct 8, 2019 at 5:51 AM Martin Varghese
<martinvarghesenokia@...il.com> wrote:
>
> From: Martin <martin.varghese@...ia.com>
>
> The Bareudp tunnel module provides a generic L3 encapsulation
> tunnelling module for tunnelling different protocols like MPLS,
> IP,NSH etc inside a UDP tunnel.
>
> Signed-off-by: Martin Varghese <martinvarghesenokia@...il.com>
> ---
>  Documentation/networking/bareudp.txt |  23 +
>  drivers/net/Kconfig                  |  13 +
>  drivers/net/Makefile                 |   1 +
>  drivers/net/bareudp.c                | 930 +++++++++++++++++++++++++++++++++++
>  include/net/bareudp.h                |  19 +
>  include/uapi/linux/if_link.h         |  12 +
>  6 files changed, 998 insertions(+)
>  create mode 100644 Documentation/networking/bareudp.txt
>  create mode 100644 drivers/net/bareudp.c
>  create mode 100644 include/net/bareudp.h
>
> diff --git a/Documentation/networking/bareudp.txt b/Documentation/networking/bareudp.txt
> new file mode 100644
> index 0000000..d2530e2
> --- /dev/null
> +++ b/Documentation/networking/bareudp.txt
> @@ -0,0 +1,23 @@
> +Bare UDP Tunnelling Module Documentation
> +========================================
> +
> +There are various L3 encapsulation standards using UDP being discussed to
> +leverage the UDP based load balancing capability of different networks.
> +MPLSoUDP (https://tools.ietf.org/html/rfc7510)is one among them.
> +
> +The Bareudp tunnel module provides a generic L3 encapsulation tunnelling
> +support for tunnelling different L3 protocols like MPLS, IP, NSH etc. inside
> +a UDP tunnel.

This patch set introduces a lot of code, much of which duplicates
existing tunnel devices, whether FOU using ipip tunnels and
fou_build_header or separate devices like vxlan and geneve. Let's try
to reuse what's there (and tested).

How does this differ from foo-over-udp (FOU). In supporting MPLS
alongside IP? If so, can it reuse the existing code, possibly with
patches to that existing tunnel logic?

I happened to have been playing around with MPLS in GRE recently. Have
not tried the same over UDP tunnels, but most of it should apply?

  ip tunnel add gre1 mode gre local 192.168.1.1 remote 192.168.1.2 dev eth0
  ip addr add 192.168.101.1 peer 192.168.101.2 dev gre1
  ip link set dev gre1 up

  sysctl net.mpls.conf.gre1.input=1
  sysctl net.mpls.platform_labels=17
  ip addr add 192.168.201.1/24 dev gre1
  ip route add 192.168.202.0/24 encap mpls 17 dev gre1
  ip -f mpls route add 16 dev lo


> +static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
> +{
> +       struct bareudp_sock *bs;
> +       struct ethhdr *eh;
> +       struct bareudp_dev *bareudp;
> +       struct metadata_dst *tun_dst = NULL;
> +       struct pcpu_sw_netstats *stats;
> +       unsigned int len;
> +       int err = 0;
> +       void *oiph;
> +       u16 proto;
> +
> +       if (unlikely(!pskb_may_pull(skb, BAREUDP_BASE_HLEN)))
> +               goto drop;
> +
> +       bs = rcu_dereference_sk_user_data(sk);
> +       if (!bs)
> +               goto drop;
> +
> +       bareudp = bs->bareudp;
> +       proto = bareudp->ethertype;
> +
> +       if (iptunnel_pull_header(skb, BAREUDP_BASE_HLEN,
> +                                proto,
> +                                !net_eq(bareudp->net,
> +                                        dev_net(bareudp->dev)))) {
> +               bareudp->dev->stats.rx_dropped++;
> +               goto drop;
> +       }
> +       tun_dst = udp_tun_rx_dst(skb, bareudp_get_sk_family(bs), TUNNEL_KEY,
> +                                0, 0);
> +       if (!tun_dst) {
> +               bareudp->dev->stats.rx_dropped++;
> +               goto drop;
> +       }
> +       skb_dst_set(skb, &tun_dst->dst);

Is this dst metadata a firm requirement? It is optional in vxlan, say.
If here, too, please split such parts off into separate follow-on
patches.


> +       skb_push(skb, sizeof(struct ethhdr));
> +       eh = (struct ethhdr *)skb->data;
> +       eh->h_proto = proto;
> +
> +       skb_reset_mac_header(skb);
> +       skb->protocol = eth_type_trans(skb, bareudp->dev);
> +       skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
> +       oiph = skb_network_header(skb);
> +       skb_reset_network_header(skb);
> +
> +       if (bareudp_get_sk_family(bs) == AF_INET)

This should be derived from packet contents, not socket state.
Although the one implies the other, I imagine.

> +static struct rtable *bareudp_get_v4_rt(struct sk_buff *skb,
> +                                       struct net_device *dev,
> +                                       struct bareudp_sock *bs4,
> +                                       struct flowi4 *fl4,
> +                                       const struct ip_tunnel_info *info)
> +{
> +       bool use_cache = ip_tunnel_dst_cache_usable(skb, info);
> +       struct bareudp_dev *bareudp = netdev_priv(dev);
> +       struct dst_cache *dst_cache;
> +       struct rtable *rt = NULL;
> +       __u8 tos;
> +
> +       if (!bs4)
> +               return ERR_PTR(-EIO);
> +
> +       memset(fl4, 0, sizeof(*fl4));
> +       fl4->flowi4_mark = skb->mark;
> +       fl4->flowi4_proto = IPPROTO_UDP;
> +       fl4->daddr = info->key.u.ipv4.dst;
> +       fl4->saddr = info->key.u.ipv4.src;
> +
> +       tos = info->key.tos;
> +       fl4->flowi4_tos = RT_TOS(tos);
> +
> +       dst_cache = (struct dst_cache *)&info->dst_cache;
> +       if (use_cache) {
> +               rt = dst_cache_get_ip4(dst_cache, &fl4->saddr);
> +               if (rt)
> +                       return rt;
> +       }
> +       rt = ip_route_output_key(bareudp->net, fl4);
> +       if (IS_ERR(rt)) {
> +               netdev_dbg(dev, "no route to %pI4\n", &fl4->daddr);
> +               return ERR_PTR(-ENETUNREACH);
> +       }
> +       if (rt->dst.dev == dev) { /* is this necessary? */
> +               netdev_dbg(dev, "circular route to %pI4\n", &fl4->daddr);
> +               ip_rt_put(rt);
> +               return ERR_PTR(-ELOOP);
> +       }
> +       if (use_cache)
> +               dst_cache_set_ip4(dst_cache, &rt->dst, fl4->saddr);
> +       return rt;
> +}
> +
> +#if IS_ENABLED(CONFIG_IPV6)
> +static struct dst_entry *bareudp_get_v6_dst(struct sk_buff *skb,
> +                                           struct net_device *dev,
> +                                           struct bareudp_sock *bs6,
> +                                           struct flowi6 *fl6,
> +                                           const struct ip_tunnel_info *info)
> +{
> +       bool use_cache = ip_tunnel_dst_cache_usable(skb, info);
> +       struct bareudp_dev *bareudp = netdev_priv(dev);
> +       struct dst_entry *dst = NULL;
> +       struct dst_cache *dst_cache;
> +       __u8 prio;
> +
> +       if (!bs6)
> +               return ERR_PTR(-EIO);
> +
> +       memset(fl6, 0, sizeof(*fl6));
> +       fl6->flowi6_mark = skb->mark;
> +       fl6->flowi6_proto = IPPROTO_UDP;
> +       fl6->daddr = info->key.u.ipv6.dst;
> +       fl6->saddr = info->key.u.ipv6.src;
> +       prio = info->key.tos;
> +
> +       fl6->flowlabel = ip6_make_flowinfo(RT_TOS(prio),
> +                                          info->key.label);
> +       dst_cache = (struct dst_cache *)&info->dst_cache;
> +       if (use_cache) {
> +               dst = dst_cache_get_ip6(dst_cache, &fl6->saddr);
> +               if (dst)
> +                       return dst;
> +       }
> +       if (ipv6_stub->ipv6_dst_lookup(bareudp->net, bs6->sock->sk, &dst,
> +                                      fl6)) {
> +               netdev_dbg(dev, "no route to %pI6\n", &fl6->daddr);
> +               return ERR_PTR(-ENETUNREACH);
> +       }
> +       if (dst->dev == dev) { /* is this necessary? */
> +               netdev_dbg(dev, "circular route to %pI6\n", &fl6->daddr);
> +               dst_release(dst);
> +               return ERR_PTR(-ELOOP);
> +       }
> +
> +       if (use_cache)
> +               dst_cache_set_ip6(dst_cache, dst, &fl6->saddr);
> +       return dst;
> +}
> +#endif

The route lookup logic is very similar to vxlan_get_route and
vxlan6_get_route. Can be reused?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ