Message-ID: <20180112211118.GE740@surrealistic.net>
Date: Fri, 12 Jan 2018 13:11:18 -0800
From: Jim Westfall <jwestfall@...realistic.net>
To: netdev@...r.kernel.org
Subject: Re: NOARP devices and NOARP arp_cache entires
Jim Westfall <jwestfall@...realistic.net> wrote [01.11.18]:
> Hi
>
> I'm seeing some weird behavior related to NOARP devices and NOARP
> entries in the arp cache.
>
> I have a couple of GRE tunnels between a Linux box and an upstream router
> that send/recv a large number of packets with unique IPs. On the order of
> 10k+ unique IPs per second are seen by the Linux box.
>
> Each one of the ips ends up getting added to the arp cache as
>
> <ip> dev tun1234 lladdr 0.0.0.0 NOARP
>
> This of course makes the arp cache grow extremely fast and overflow.
> While I can tweak gc_thresh1/2/3 to make the arp cache size huge, it doesn't
> seem like the right answer, as the kernel is spinning its wheels having to
> add and expunge entries for the high rate of unique IPs.
>
> I'm unclear why the kernel is even tracking them in the arp cache. If the
> routing table says to route the packet out a NOARP interface, then there is
> no arp, so why involve the arp cache at all?
>
> You can see the behavior with the following
>
> [root@...stfall:~]# uname -a
> Linux jwestfall.jwestfall.net 4.14.10_1 #1 SMP PREEMPT Sun Dec 31 20:23:29 UTC 2017 x86_64 GNU/Linux
>
> [root@...stfall:~]# ip neigh show nud noarp
> 10.0.0.172 dev lo lladdr 00:00:00:00:00:00 NOARP
> 10.70.50.5 dev tun0 lladdr 08 NOARP
> 127.0.0.1 dev lo lladdr 00:00:00:00:00:00 NOARP
>
> Set up a bogus GRE tunnel; the remote IP doesn't matter
> [root@...stfall:~]# ip tunnel add tun1234 mode gre local 10.0.0.172 remote 10.0.0.156 dev enp4s0
> [root@...stfall:~]# ip link set up dev tun1234
>
> Route a bogus network to the tunnel
> [root@...stfall:~]# ip route add 192.168.111.0/24 dev tun1234
>
> Ping ips on the bogus network
> [root@...stfall:~]# nmap -sP 192.168.111.0/24
>
> Starting Nmap 7.60 ( https://nmap.org ) at 2018-01-11 12:06 PST
> ...
>
> [root@...stfall:~]# ip neigh show nud noarp
> 192.168.111.18 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.4 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.28 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.17 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.14 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.34 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.3 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.20 dev tun1234 lladdr 0.0.0.0 NOARP
> 10.0.0.172 dev lo lladdr 00:00:00:00:00:00 NOARP
> 192.168.111.6 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.27 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.13 dev tun1234 lladdr 0.0.0.0 NOARP
> 192.168.111.33 dev tun1234 lladdr 0.0.0.0 NOARP
> ...
>
> Also somewhat interesting is that on older kernels (3.2 time range) these
> NOARP entries didn't get added for ipv4, but they did for ipv6 if you
> pushed ipv6 through the ipv4 tunnel.
>
> 2804:14c:f281:a1d8:61a2:a30:989f:3eb1 dev tun1 lladdr 0.0.0.0 NOARP
> 2607:8400:2122:4:e9f9:dbb8:2d44:75d1 dev tun2 lladdr 0.0.0.0 NOARP
>
> Thanks
> Jim Westfall
>
>
Digging into this a bit: older kernels had the following at the top of
ipv4_neigh_lookup():

static struct neighbour *ipv4_neigh_lookup(const struct dst_entry *dst, const void *daddr)
{
	static const __be32 inaddr_any = 0;
	struct net_device *dev = dst->dev;
	const __be32 *pkey = daddr;
	struct neighbour *n;

	if (dev->flags & (IFF_LOOPBACK | IFF_POINTOPOINT))
		pkey = &inaddr_any;

which forced the hash key to 0.0.0.0 for tunnels, so all destinations routed
out a given tunnel shared one neighbour entry. This was removed in commit
a263b3093641fb1ec377582c90986a7fd0625184, which was part of a larger series
to "Disconnect neigh from dst_entry".
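
To illustrate why the key choice matters for cache size, here is a minimal userspace sketch, not kernel code. All names prefixed with `sketch_` are hypothetical, the flag values are arbitrary stand-ins for IFF_LOOPBACK and IFF_POINTOPOINT, and the "cache" is a plain array rather than the kernel's neighbour hash table. It assumes the tunnel device carries the point-to-point flag, and it replays the 192.168.111.0/24 scan from the quoted message with and without the 0.0.0.0 key substitution.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's interface flags (values arbitrary). */
#define SKETCH_IFF_LOOPBACK    0x1u
#define SKETCH_IFF_POINTOPOINT 0x2u

#define SKETCH_TABLE_MAX 1024

/* Toy neighbour cache: a flat array of keys, searched linearly. */
struct sketch_table {
	uint32_t keys[SKETCH_TABLE_MAX];
	size_t count;
};

/* Pick the lookup key the way the older excerpt does: loopback and
 * point-to-point devices always use 0.0.0.0 instead of the destination. */
static uint32_t sketch_pick_key(unsigned int dev_flags, uint32_t daddr)
{
	if (dev_flags & (SKETCH_IFF_LOOPBACK | SKETCH_IFF_POINTOPOINT))
		return 0; /* 0.0.0.0 */
	return daddr;
}

/* Find an entry for key, creating one if it does not exist yet. */
static void sketch_lookup_or_create(struct sketch_table *t, uint32_t key)
{
	for (size_t i = 0; i < t->count; i++)
		if (t->keys[i] == key)
			return;
	assert(t->count < SKETCH_TABLE_MAX);
	t->keys[t->count++] = key;
}

/* Touch hosts 192.168.111.1 through 192.168.111.254 once each and return
 * how many cache entries exist afterwards. legacy_key != 0 applies the
 * 0.0.0.0 substitution; legacy_key == 0 always keys on the destination. */
static size_t sketch_entries_after_scan(unsigned int dev_flags, int legacy_key)
{
	struct sketch_table t = { .count = 0 };
	const uint32_t net = (192u << 24) | (168u << 16) | (111u << 8);

	for (uint32_t host = 1; host <= 254; host++) {
		uint32_t daddr = net | host;
		uint32_t key = legacy_key ? sketch_pick_key(dev_flags, daddr)
					  : daddr;
		sketch_lookup_or_create(&t, key);
	}
	return t.count;
}
```

In this sketch, `sketch_entries_after_scan(SKETCH_IFF_POINTOPOINT, 1)` returns 1, while `sketch_entries_after_scan(SKETCH_IFF_POINTOPOINT, 0)` returns 254, which mirrors the one-entry-per-destination growth shown in the `ip neigh` output above.
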
Would there be any objection to my submitting a patch that restores this
older behavior?
Thanks
jim