netdev - Re: [PATCH net-next 2/5] ipv6: IOAM tunnel decapsulation

Message-ID: <CALx6S37wpA5Mc7jdUk8_sR_fJTc-zRpvY8VkDV=NoWdvDhKfpg@mail.gmail.com>
Date:   Thu, 25 Jun 2020 17:48:43 -0700
From:   Tom Herbert <tom@...bertland.com>
To:     Justin Iurman <justin.iurman@...ege.be>
Cc:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>
Subject: Re: [PATCH net-next 2/5] ipv6: IOAM tunnel decapsulation

On Thu, Jun 25, 2020 at 10:56 AM Justin Iurman <justin.iurman@...ege.be> wrote:
>
> >> Implement the IOAM egress behavior.
> >>
> >> According to RFC 8200:
> >> "Extension headers (except for the Hop-by-Hop Options header) are not
> >>  processed, inserted, or deleted by any node along a packet's delivery
> >>  path, until the packet reaches the node (or each of the set of nodes,
> >>  in the case of multicast) identified in the Destination Address field
> >>  of the IPv6 header."
> >>
> >> Therefore, an ingress node (an IOAM domain border) must encapsulate an
> >> incoming IPv6 packet with another similar IPv6 header that will contain
> >> IOAM data while it traverses the domain. When leaving, the egress node,
> >> another IOAM domain border which is also the tunnel destination, must
> >> decapsulate the packet.
> >
> > This is just IP in IP encapsulation that happens to be terminated at
> > an egress node of the IOAM domain. The fact that it's IOAM isn't
> > germane; this IP in IP is done in a variety of ways. We should be
> > using the normal protocol handler for NEXTHDR_IPV6 instead of special
> > case code.
>
> Agree. The reason for this special case code is that I was not aware of a more elegant solution.
>
The current implementation might not be what you're looking for, since
ip6ip6 wants a tunnel configured. What we really want is more like
anonymous decapsulation, that is, just decap the ip6ip6 packet and
resubmit the inner packet into the stack (this is what your patch is
doing). The idea has been kicked around before, especially for the use
case where we're tunneling across a domain and there could be hundreds
of such tunnels to some device. I think it's generally okay to do this,
although someone might raise security concerns since it sort of
obfuscates the "real packet". It probably makes sense to have a sysctl
to enable this, and it could probably default to on. Of course, if we do
this, the next question is whether we should also implement anonymous
decapsulation for 44, 64, and 46 tunnels.
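
To make "anonymous decapsulation" concrete, here is a rough userspace
sketch of the idea, not kernel code and not a proposed implementation.
It assumes the outer header has no extension headers (an IOAM packet
would carry a Hop-by-Hop header that this ignores). The names
anon_decap_ip6ip6() and anon_decap_enabled are hypothetical; the flag
only stands in for the sysctl suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IPV6_HDR_LEN 40 /* fixed IPv6 header size (RFC 8200) */
#define NEXTHDR_IPV6 41 /* Next Header value for IPv6-in-IPv6 */

/* Hypothetical stand-in for the suggested sysctl; defaults to on. */
static int anon_decap_enabled = 1;

/*
 * If pkt is an IPv6 packet whose Next Header is IPv6, return a pointer to
 * the inner packet and store its length in *inner_len. No tunnel state is
 * consulted: any ip6ip6 packet addressed to us would be accepted.
 * Returns NULL if the feature is off or the packet does not qualify.
 */
static uint8_t *anon_decap_ip6ip6(uint8_t *pkt, size_t len, size_t *inner_len)
{
	assert(pkt != NULL && inner_len != NULL);

	if (!anon_decap_enabled)
		return NULL;

	/* Need a complete outer header followed by a complete inner header. */
	if (len < 2 * IPV6_HDR_LEN)
		return NULL;

	/* The version nibble must be 6 in both headers. */
	if ((pkt[0] >> 4) != 6 || (pkt[IPV6_HDR_LEN] >> 4) != 6)
		return NULL;

	/* Byte 6 of the IPv6 header is the Next Header field. */
	if (pkt[6] != NEXTHDR_IPV6)
		return NULL;

	/* Like the quoted patch, carry the outer hop limit (byte 7) inward. */
	pkt[IPV6_HDR_LEN + 7] = pkt[7];

	*inner_len = len - IPV6_HDR_LEN;
	return pkt + IPV6_HDR_LEN;
}
```

In the kernel the returned inner packet would be resubmitted to the
stack (the patch does this with netif_rx()); here the caller simply gets
a pointer and a length.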

Tom

> Justin
>
> >> Signed-off-by: Justin Iurman <justin.iurman@...ege.be>
> >> ---
> >>  include/linux/ipv6.h |  1 +
> >>  net/ipv6/ip6_input.c | 22 ++++++++++++++++++++++
> >>  2 files changed, 23 insertions(+)
> >>
> >> diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
> >> index 2cb445a8fc9e..5312a718bc7a 100644
> >> --- a/include/linux/ipv6.h
> >> +++ b/include/linux/ipv6.h
> >> @@ -138,6 +138,7 @@ struct inet6_skb_parm {
> >>  #define IP6SKB_HOPBYHOP        32
> >>  #define IP6SKB_L3SLAVE         64
> >>  #define IP6SKB_JUMBOGRAM      128
> >> +#define IP6SKB_IOAM           256
> >>  };
> >>
> >>  #if defined(CONFIG_NET_L3_MASTER_DEV)
> >> diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
> >> index e96304d8a4a7..8cf75cc5e806 100644
> >> --- a/net/ipv6/ip6_input.c
> >> +++ b/net/ipv6/ip6_input.c
> >> @@ -361,9 +361,11 @@ INDIRECT_CALLABLE_DECLARE(int tcp_v6_rcv(struct sk_buff *));
> >>  void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
> >>                               bool have_final)
> >>  {
> >> +       struct inet6_skb_parm *opt = IP6CB(skb);
> >>         const struct inet6_protocol *ipprot;
> >>         struct inet6_dev *idev;
> >>         unsigned int nhoff;
> >> +       u8 hop_limit;
> >>         bool raw;
> >>
> >>         /*
> >> @@ -450,6 +452,25 @@ void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
> >>         } else {
> >>                 if (!raw) {
> >>                         if (xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb)) {
> >> +                               /* IOAM Tunnel Decapsulation
> >> +                                * Packet is going to re-enter the stack
> >> +                                */
> >> +                               if (nexthdr == NEXTHDR_IPV6 &&
> >> +                                   (opt->flags & IP6SKB_IOAM)) {
> >> +                                       hop_limit = ipv6_hdr(skb)->hop_limit;
> >> +
> >> +                                       skb_reset_network_header(skb);
> >> +                                       skb_reset_transport_header(skb);
> >> +                                       skb->encapsulation = 0;
> >> +
> >> +                                       ipv6_hdr(skb)->hop_limit = hop_limit;
> >> +                                       __skb_tunnel_rx(skb, skb->dev,
> >> +                                                       dev_net(skb->dev));
> >> +
> >> +                                       netif_rx(skb);
> >> +                                       goto out;
> >> +                               }
> >> +
> >>                                 __IP6_INC_STATS(net, idev,
> >>                                                 IPSTATS_MIB_INUNKNOWNPROTOS);
> >>                                 icmpv6_send(skb, ICMPV6_PARAMPROB,
> >> @@ -461,6 +482,7 @@ void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
> >>                         consume_skb(skb);
> >>                 }
> >>         }
> >> +out:
> >>         return;
> >>
> >>  discard:
> >> --
> >> 2.17.1