netdev - Re: [PATCH net] af_packet: fix raw sockets over 6in4 tunnel

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAF=yD-KBLi4vrnBjS1JBzEvBgED7hWt_mv0chynWZDrQ0ZR=vw@mail.gmail.com>
Date:   Sat, 12 Jan 2019 13:00:24 -0500
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Nicolas Dichtel <nicolas.dichtel@...nd.com>
Cc:     David Miller <davem@...emloft.net>,
        Network Development <netdev@...r.kernel.org>
Subject: Re: [PATCH net] af_packet: fix raw sockets over 6in4 tunnel

On Fri, Jan 11, 2019 at 4:29 PM Willem de Bruijn
<willemdebruijn.kernel@...il.com> wrote:
>
> On Fri, Jan 11, 2019 at 9:44 AM Nicolas Dichtel
> <nicolas.dichtel@...nd.com> wrote:
> >
> > Since commit cb9f1b783850, scapy (which uses an AF_PACKET socket in
> > SOCK_RAW mode) is unable to send a basic icmp packet over a sit tunnel:
> >
> > Here is a example of the setup:
> > $ ip link set ntfp2 up
> > $ ip addr add 10.125.0.1/24 dev ntfp2
> > $ ip tunnel add tun1 mode sit ttl 64 local 10.125.0.1 remote 10.125.0.2 dev ntfp2
> > $ ip addr add fd00:cafe:cafe::1/128 dev tun1
> > $ ip link set dev tun1 up
> > $ ip route add fd00:200::/64 dev tun1
> > $ scapy
> > >>> p = []
> > >>> p += IPv6(src='fd00:100::1', dst='fd00:200::1')/ICMPv6EchoRequest()
> > >>> send(p, count=1, inter=0.1)
> > >>> quit()
> > $ ip -s link ls dev tun1 | grep -A1 "TX.*errors"
> >     TX: bytes  packets  errors  dropped carrier collsns
> >     0          0        1       0       0       0
> >
> > The problem is that the network offset is set to the hard_header_len of the
> > output device (tun1, ie 14 + 20) and in our case, because the packet is
> > small (48 bytes) the pskb_inet_may_pull() fails (it tries to pull 40 bytes
> > (ipv6 header) starting from the network offset). Let's reset the network
> > offset so that it points to the beginning of the L3 data, ie skb->data.
>
> This only just landed in my inbox, sorry for the delay. So far I'm not
> reproducing the issue, but I'm trying and am having a look.
>
> Which pskb_inet_may_pull() is failing?

Okay. I see what's going on.

Commit cb9f1b783850 converted a pskb_may_pull ipv6hdr, pulling from
skb->data, to pskb_network_may_pull, pulling from skb->network_header.

Normally packet sockets with SOCK_RAW write a fixed size hardware
header and place skb->network_header directly after that. Dropping
these in sit_tunnel_xmit if they fail pskb_inet_may_pull is correct.

But with devices with variable length hardware headers like sit, it is
possible to cook packets that are shorter than the upper bound hhlen.
In that case, we have no way of knowing where the variable length
header ends (short of device specific parsing if implemented, see also
header_ops.validate), so we set the network header to the same as the
mac header as of commit 993675a3100b1.

This is not a good thing to do in general, and not needed in the
common case.

The issue here is that the icmpv6 packet (48B) exceeds the sit hhlen
(34), so is not caught by that workaround. We need to extend it to
cases where the variable length hhlen is less than its upper bound
in such a way that the packet as a whole exceeds hhlen, but by so
little that the L3 header still starts before dev->hard_header_len.

The simplest, to handle ip validation, is to extend the check from

  if (len < reserve)
    skb_reset_network_header(skb);

to one that accounts for requiring a fully formed protocol header

        } else if (reserve) {
+               int l3_min_len = reserve;
+
                skb_reserve(skb, -reserve);
-               if (len < reserve)
+
+               if (proto == htons(ETH_P_IPV6))
+                       l3_min_len += sizeof(struct ipv6hdr);
+               else if (proto == htons(ETH_P_IP))
+                       l3_min_len += sizeof(struct iphdr);
+
+               if (len < l3_min_len)
                        skb_reset_network_header(skb);
        }

Instead of special casing protocols, it might be cleaner to convert
dev_validate_header to return instead of a pass/fail bool the hardware
header length. And then update skb->network_header accordingly. That
would definitely only be for net-next.

For completeness, my variant of your test on top of v5.0-rc1

"
ip netns add test
ip netns exec test bash

ip link set lo up
ip addr add 10.0.0.1/24 dev lo
ip tunnel add tun1 mode sit ttl 64 local 10.0.0.1 remote 10.0.0.2 dev lo
ip addr add fd00:cafe:cafe::1/128 dev tun1
ip link set dev tun1 up
ip route add fd00:200::/64 dev tun1

tcpdump -vvv -i tun1 -n &
sleep 0.4

scapy
p = []
p += IPv6(src='fd00:100::1', dst='fd00:200::1')/ICMPv6EchoRequest()
send(p, count=1, inter=0.1)
quit()
"