[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5CE212DA-0C5E-47B7-9612-4BE6EA224488@drivenets.com>
Date: Wed, 21 Jul 2021 20:18:56 +0000
From: Gilad Naaman <gnaaman@...venets.com>
To: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: Possible GRE-TAP/ECT bug
This can be reproduced by using this:
$ cat setup_bug.sh
#!/bin/bash
ip netns add A
ip netns add B
ip -n A link add _v0 type veth peer name _v1 netns B
ip -n A link set _v0 up
ip -n A addr add dev _v0 10.254.3.1/24
ip -n A route add default dev _v0 scope global
ip -n B link set _v1 up
ip -n B addr add dev _v1 10.254.1.6/24
ip -n B route add default dev _v1 scope global
ip -n B link add florp type gretap local 10.254.1.6 remote 10.254.3.1 key 0x49000000
ip -n B link set florp up
$ cat send_pkt.py
#!/usr/bin/env python3
from scapy.all import *
pkt = IP(b'E\x01\x00\xa7\x00\x00\x00\x00@/`%\n\xfe\x03\x01\n\xfe\x01\x06 \x00eXI\x00\x00\x00\x18\xbe\x92\xa0\xee&\x18\xb0\x92\xa0l&\x08\x00E\x00\x00}\x8b\x85@\x00\x01\x01\xe4\xf2\x82\x82\x82\x01\x82\x82\x82\x02\x08\x00d\x11\xa6\xeb3\x1e\x1e\\xf3\\xf7`\x00\x00\x00\x00ZN\x00\x00\x00\x00\x00\x00\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@...DEFGHIJKLMNOPQRSTUVWXYZ')
send(pkt)
And then
$ sudo ip netns exec B tcpdump -neqllvi florp icmp &
$ sudo ip netns exec A python3 send_pkt.py
> On 21 Jul 2021, at 20:36, Gilad Naaman <gnaaman@...venets.com> wrote:
>
> Hey,
>
> We’ve encountered a bug with handling of the ECT bits in GRE-tap encapsulated IPv4 traffic on Kernel 5.4.
>
> This bug manifests when a packet in the following form enters the system:
>
> +-----------+
> |Inner IPv4 | ECT=0b00
> +-----------+
> |Ethernet |
> +-----------+
> |GRE |
> +-----------+
> |Outer IPv4 | ECT=0b01 | ECT=0b10
> +-----------+
>
> When inspected by tcpdump-ing the outer interface, the packet is fine;
> when inspected by tcpdump-ing the GRE interface, a small amount of bits is changed in the source/destination addresses of the inner Ethernet header.
>
> Upon further inspection, we found out that the bits changed in the ethernet headers correspond to the position of the ECT/Checksum fields in the IPv4 header.
> In other words, some code path tries to copy the ECT bits from outer to the inner IPv4 header and correct the checksum, but it accidentally treats the ethernet header as an IP header.
>
> Unfortunately this results in the loss of all IPv4 traffic.
>
> This doesn’t happen if the internal protocol is IPv6 or IS-IS, or if the external ECT is 0.
>
> We’ve tried to search for places that modify the `tos` field of an skb, and found just enough to say “yes, some code wants to sync the ECT bits”,
> but not to find the right one.
>
> Most prominently, xfrm does just that, but as far as I can tell it is not relevant to these packets.
> (It is done by calling `ipv4_copy_dscp` from `xfrm4_remove_tunnel_encap`)
>
> We’ve gone over the ip_tunnel.c code dmesg shows that `ip_tunnel: non-ECT from 10.254.3.1 with TOS=0x1` is printed for the packet mentioned below.
>
> Would love if someone can direct me at where to look for this ECT copying code for further analysis.
>
> Thank you!
>
>
> =================================================
>
> As an example to a modified packet, here’s an ICMP ping.
> The sent packet is
>
> ```
> <IP version=4 ihl=5 tos=0x1 id=0 flags= frag=0 ttl=64 proto=gre src=10.254.3.1 dst=10.254.1.6 |<GRE chksum_present=0 routing_present=0 key_present=1 seqnum_present=0 strict_route_source=0 recursion_control=0 flags=0 version=0 proto=TEB key=0x49000000 |<Ether dst=18:be:92:a0:ee:26 src=18:b0:92:a0:6c:26 type=IPv4 |<IP version=4 ihl=5 tos=0x0 id=35717 flags=DF frag=0 ttl=1 proto=icmp src=130.130.130.1 dst=130.130.130.2 |<ICMP type=echo-request code=0 id=0xa6eb seq=0x331e unused='' |<Raw load='\x1e\\xf3\\xf7`\x00\x00\x00\x00ZN\x00\x00\x00\x00\x00\x00\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@...DEFGHIJKLMNOPQRSTUVWXYZ' |>>>>>>
> ```
> or in hex:
> 450100a700000000402f60250afe03010afe0106200065584900000018be92a0ee2618b092a06c2608004500007d8b8540000101e4f2828282018282820208006411a6eb331e1e5c7866335c78663760000000005a4e000000000000101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f404142434445464748494a4b4c4d4e4f505152535455565758595a
>
>
>
> The received packet is
> ```
> <Ether dst=18:bd:92:a0:ee:26 src=18:b0:92:a0:6c:27 type=IPv4 |<IP version=4 ihl=5 tos=0x0 len=125 id=35717 flags=DF frag=0 ttl=1 proto=icmp chksum=0xe4f2 src=130.130.130.1 dst=130.130.130.2 |<ICMP type=echo-request code=0 chksum=0x6411 id=0xa6eb seq=0x331e unused='' |<Raw load='\x1e\\xf3\\xf7`\x00\x00\x00\x00ZN\x00\x00\x00\x00\x00\x00\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@...DEFGHIJKLMNOPQRSTUVWXYZ' |>>>>
> ```
> or
> 18bd92a0ee2618b092a06c2708004500007d8b8540000101e4f2828282018282820208006411a6eb331e1e5c7866335c78663760000000005a4e000000000000101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f404142434445464748494a4b4c4d4e4f505152535455565758595a
>
>
> Note how the destination address changes from `18:be:92:a0:ee:26` to `18:bd:92:a0:ee:26`.
> Where the lower 2 bits of the second octet are overwritten with the ECT `0b01`.
>
> The source address is then changed from `18:b0:92:a0:6c:26` to `18:b0:92:a0:6c:27`, probably as an attempt to fix the checksum.
> In practice it’s always incremented by 1, even though I’m not sure it results in a correct checksum.
>
> If the MAC address is changed on the interface to match the ECT, or the other ECT is used in the packet, nothing changes and the packet is received correctly.
Powered by blists - more mailing lists