Message-ID: <CACcUnf_HsNiG5aDdpOCuXY436PY297Yof0MFfp7eVhKgPgUn_A@mail.gmail.com>
Date: Tue, 9 Oct 2018 11:58:33 -0400
From: Josh Coombs <jcoombs@...ff.gwi.net>
To: netdev@...r.kernel.org
Subject: Possible bug in traffic control?
Hello all, I'm looking for some guidance in chasing what I believe to
be a bug in kernel traffic control filters. If I'm pinging the wrong
list let me know.
I have a homebrew MACSec bridge setup using two pairs of PCs. I
establish a MACSec link between them, and then use TC to bridge a
second ethernet interface over the MACSec link. The second interface
is connected to a Juniper switch at each end, and I'm using LACP over
the links to bond them up for redundancy. It turns out I need that
redundancy, as after a while one pair of bridges will stop passing
packets in one direction. I've since replicated this failure with a
group of VMs as well.
My test setup to replicate the failure inside ESXi:
- Two MACSec bridge VMs, A and Z
- Two IPerf VMs, A and Z
My VMs are currently built using Ubuntu Server 18.04 to keep things
quick; no additional packages are required besides iperf3. The kernel
version as currently shipped is 4.15.0-36. I highly advise using a CPU
with AES instruction support, as MACSec eats CPU without it and the
symptoms will take longer to reproduce.
- A 'MACSec Bridge' network
- A 'A Side link' network
- A 'Z Side link' network
In ESXi I used a dedicated vSwitch with a 9000 MTU (to allow full 1500
byte ethernet packets + MACSec to pass on the bridge), and the security
policy is fully open (allow promiscuous, allow forged, allow MAC
changes) as we're abusing the networks as direct point to point links.
If using physical machines, just cable them up; my example script bumps
the MTU as required.
The MACSec boxes have two ethernet interfaces each. One pair is on
the MACSec Bridge network. The other interfaces go to the A and Z
IPerf boxes respectively via their dedicated networks. A and Z need
their interfaces configured with IPs in a common subnet, such as
192.168.0.1/30 and 192.168.0.2/30.
My script sets up MACSec, tweaks MTUs, and touches a few sysctls to
turn the involved interfaces into silent actors. It then uses TC to
start the actual bridging. From there I've been firing up iperf3
sessions in both directions between A and Z to hammer the bridge until
it fails. When it does, I can see packets stop being bridged in one
direction on one MACSec host, but not the other. The second host
continues to flow packets in both directions. Nothing is logged to
dmesg when this fault occurs. The fault seems to occur at roughly the
same packet / traffic amount each time. On my main application it's
after approximately 2.5TB of traffic (random mix of sizes) and with my
test bed it was after 5.5TB of 1500 byte packets.
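For scale, the test bed figure above can be converted into an approximate packet count. The sketch below is only an illustration and is not part of the setup script; it assumes decimal terabytes (10^12 bytes) and uses a hypothetical helper named packets_for. The comparison against 2^32 is included because MACsec as specified in IEEE 802.1AE uses a 32-bit packet number by default; whether that limit has anything to do with the stall is an assumption, not something established here.

```shell
#!/bin/bash
# Rough conversion from traffic volume to frame count.
# Assumption: "TB" means 10^12 bytes and every frame is frame_size bytes.

# packets_for BYTES FRAME_SIZE -> whole number of frames (integer division)
packets_for() {
    local bytes=$1 frame_size=$2
    echo $(( bytes / frame_size ))
}

# Test bed figure from the report: 5.5TB of 1500 byte packets.
test_bed=$(packets_for 5500000000000 1500)

# Size of a 32-bit packet number space, for comparison only.
pn_space=$(( 1 << 32 ))

echo "test bed: ~${test_bed} packets, 32-bit PN space: ${pn_space}"
```

The 2.5TB figure from the main application cannot be converted the same way, since the packet sizes there are a random mix.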
On the impacted MACSec node, watching interface packet counters via
ifconfig and actual traffic with tcpdump, I can see packets coming in
on the MACSec interface and going out the host interface, and the host
reply coming in but not showing up on the MACSec interface to cross the
bridge. Clearing
out the tc filter and qdisc and re-adding does not restore traffic
flow.
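The counter comparison described above could also be scripted rather than read off ifconfig by hand. This is a minimal sketch, assuming the per-interface counters that Linux exposes under /sys/class/net/<if>/statistics; the function name if_counters and the SYSFS_NET override variable are hypothetical and exist only for this example.

```shell
#!/bin/bash
# Print rx/tx packet counters for the named interfaces, read from sysfs.
# SYSFS_NET is a hypothetical override so the function can be pointed
# at a different directory tree; it defaults to the real sysfs path.
: "${SYSFS_NET:=/sys/class/net}"

if_counters() {
    local ifname=$1
    local stats="$SYSFS_NET/$ifname/statistics"
    if [ ! -r "$stats/rx_packets" ] || [ ! -r "$stats/tx_packets" ]; then
        echo "no counters for $ifname" >&2
        return 1
    fi
    printf '%s rx=%s tx=%s\n' "$ifname" \
        "$(cat "$stats/rx_packets")" "$(cat "$stats/tx_packets")"
}

# Print one line per interface given on the command line, if any.
for ifname in "$@"; do
    if_counters "$ifname"
done
```

Run with the three interface names from the setup script (for example eno2, enp1s0f0 and macsec0) a few seconds apart; a direction that has stalled should show an rx counter still climbing on one interface while the tx counter on its bridge partner stays flat.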
There is a PPA with 4.18 available for Ubuntu that I'm going to test
with next to see if that makes a difference in behavior. In the
meantime I'd appreciate any suggestions on how to diagnose this.
My MACSec bridge setup script follows. Update sif, dif, the keys and
rxmac to match your setup. The rxmac is the MAC address of the remote
bridge interface. Keys need to be swapped between systems.
-----------------------
#!/bin/bash
# Interfaces:
# sif = Ingress physical interface (Source)
# dif = Egress physical interface (Dest)
# eif = Encrypted interface
sif=eno2
dif=enp1s0f0
eif=macsec0
# MACSec Keys:
# txkey = Transmit (Local) key
# rxkey = Receive (Remote) key
# rxmac = Receive (Remote) MAC address
txkey=00000000000000000000000000000000
rxkey=99999999999999999999999999999999
rxmac=00:11:22:33:44:55
# Use jumbo frames for macsec to allow full 1500 MTU passthrough:
echo "* MTU update"
ip link set "$sif" mtu 9000
ip link set "$dif" mtu 9000
# Bring up macsec:
echo "* Enable MACSec"
modprobe macsec
ip link add link "$dif" "$eif" type macsec
ip macsec add "$eif" tx sa 0 pn 1 on key 02 "$txkey"
ip macsec add "$eif" rx address "$rxmac" port 1
ip macsec add "$eif" rx address "$rxmac" port 1 sa 0 pn 1 on key 01 "$rxkey"
ip link set "$eif" type macsec encrypt on
#ip link set "$eif" type macsec replay on window 64
# Keep system from trying to respond to observed traffic:
echo "* Clamp the system so bridge ports NEVER respond to traffic"
sysctl -w net.ipv4.conf.default.arp_filter=1
sysctl -w net.ipv4.conf.all.arp_filter=1
ip link set "$sif" down promisc on arp off multicast off
sysctl -w net.ipv6.conf."$sif".autoconf=0
sysctl -w net.ipv6.conf."$sif".accept_ra=0
sysctl -w net.ipv4.conf."$sif".arp_ignore=8
sysctl -w net.ipv4.conf."$sif".rp_filter=0
ip link set "$dif" down promisc on arp off multicast off
sysctl -w net.ipv6.conf."$dif".autoconf=0
sysctl -w net.ipv6.conf."$dif".accept_ra=0
sysctl -w net.ipv4.conf."$dif".arp_ignore=8
sysctl -w net.ipv4.conf."$dif".rp_filter=0
ip link set "$eif" down promisc on arp off multicast off
sysctl -w net.ipv6.conf."$eif".autoconf=0
sysctl -w net.ipv6.conf."$eif".accept_ra=0
sysctl -w net.ipv4.conf."$eif".arp_ignore=8
sysctl -w net.ipv4.conf."$eif".rp_filter=0
# Set up traffic mirroring:
echo "* Start Port Mirror"
# sif to eif
tc qdisc add dev "$sif" ingress
tc filter add dev "$sif" parent ffff: \
protocol all \
u32 match u8 0 0 \
action mirred egress mirror dev "$eif"
# eif to sif
tc qdisc add dev "$eif" ingress
tc filter add dev "$eif" parent ffff: \
protocol all \
u32 match u8 0 0 \
action mirred egress mirror dev "$sif"
# Bring up the interfaces:
echo "* Light tunnel NICS"
ip link set "$sif" up
ip link set "$dif" up
ip link set "$eif" up
echo " --=[ MACSec Up ]=--"
-----------------------
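Once the script has run, the state it created can be inspected with the statistics flags of ip and tc. The following is a hedged sketch rather than part of the original script: the run wrapper, the DRY_RUN variable and the show_bridge_state function are hypothetical names, and the interface defaults are simply copied from the script above. With DRY_RUN=1 (the default here) the commands are only printed, so nothing needs root until DRY_RUN=0 is set.

```shell
#!/bin/bash
# Inspect MACsec and tc state for the bridge interfaces.
# Assumption: sif/eif match the names used in the setup script.
sif=${sif:-eno2}
eif=${eif:-macsec0}
DRY_RUN=${DRY_RUN:-1}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

show_bridge_state() {
    # MACsec SA state and per-SA statistics
    run ip -s macsec show
    # Ingress qdisc on the host-facing interface
    run tc -s qdisc show dev "$sif"
    # Mirred filters and their hit counters, both directions
    run tc -s filter show dev "$sif" parent ffff:
    run tc -s filter show dev "$eif" parent ffff:
}

show_bridge_state
```

Capturing this output before and after the stall would show whether the filter counters or the MACsec SA counters stop advancing first.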
Josh Coombs