Message-ID: <876582ab-2b8c-7e46-7795-236c0ef6d90d@gmail.com>
Date:   Wed, 1 May 2019 17:47:44 -0600
From:   David Ahern <dsahern@...il.com>
To:     Alan Maguire <alan.maguire@...cle.com>, netdev@...r.kernel.org
Cc:     daniel@...earbox.net, Ian Kumlien <ian.kumlien@...il.com>
Subject: Re: MPLS encapsulation and arp table overflow

On 5/1/19 10:03 AM, Alan Maguire wrote:
> I'm seeing the following repeated error
> 
> [  130.821362] neighbour: arp_cache: neighbor table overflow!
> 
> when using MPLSoverGRE or MPLSoverUDP tunnels on bits synced
> with bpf-next as of this morning. The test script below reliably
> reproduces the problem, while working fine on a 4.14 (I haven't
> bisected yet). It can be run with no arguments, or specifying
> gre or udp for the specific encap type.
> 
> It seems that every MPLS-encapsulated outbound packet is attempting
> to add a neighbor entry, and as a result we hit the
> net.ipv4.neigh.default.gc_thresh3 limit quickly.
> 
> When this failure occurs, the arp table doesn't show any of
> these additional entries. Existing arp table entries are
> disappearing too, so perhaps they are being recycled when the
> table becomes full?
> 

There are 2 bugs:
1. neigh_xmit fails to find a neighbor entry on every single Tx. This
was introduced by:

cd9ff4de010 ("ipv4: Make neigh lookup keys for loopback/point-to-point
devices be INADDR_ANY")

Basically, the primary_key is reset to 0 for tunnel devices, but the
neigh_xmit lookup was not updated to use the same key.

That caused a new neigh entry to be created on each packet Tx, but
before inserting the new one into the table, the create function checks
whether an entry already exists. The arp constructor had reset the key to
0 in the new neighbor entry, so this second lookup finds a match and the
new one is dropped.
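
The sequence above can be modelled in a few lines of userspace Python.
This is only a sketch, not kernel code: NeighTable, xmit_buggy,
xmit_fixed and count_allocs are hypothetical names, and the corrected
lookup shown is just one plausible way to make the two keys agree.

```python
INADDR_ANY = 0


class NeighTable:
    """Toy neighbour table keyed by (key, dev); not the kernel structure."""

    def __init__(self):
        self.entries = {}
        self.allocs = 0  # entries allocated, whether kept or dropped

    def lookup(self, key, dev):
        return self.entries.get((key, dev))

    def create(self, key, dev, point_to_point):
        self.allocs += 1
        # Stand-in for the constructor: such devices get key 0.
        if point_to_point:
            key = INADDR_ANY
        # Second lookup, done with the rewritten key before inserting.
        existing = self.lookup(key, dev)
        if existing is not None:
            return existing  # the freshly allocated entry is dropped
        entry = {"key": key, "dev": dev}
        self.entries[(key, dev)] = entry
        return entry


def xmit_buggy(tbl, nexthop, dev, point_to_point):
    # Looks up with the raw next hop, which never matches key 0.
    neigh = tbl.lookup(nexthop, dev)
    if neigh is None:
        neigh = tbl.create(nexthop, dev, point_to_point)
    return neigh


def xmit_fixed(tbl, nexthop, dev, point_to_point):
    # Assumed correction: apply the same key rewrite before the lookup.
    key = INADDR_ANY if point_to_point else nexthop
    neigh = tbl.lookup(key, dev)
    if neigh is None:
        neigh = tbl.create(nexthop, dev, point_to_point)
    return neigh


def count_allocs(xmit, packets=5):
    """Send `packets` packets; return (allocations, entries in table)."""
    tbl = NeighTable()
    for _ in range(packets):
        xmit(tbl, 0x0A000001, "gre0", True)
    return tbl.allocs, len(tbl.entries)
```

In this model, count_allocs(xmit_buggy) allocates once per packet while
the table holds a single entry; count_allocs(xmit_fixed) allocates once.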

That exposed a second bug.

2. neigh_alloc bumps the gc_entries counter when a new entry is allocated,
but ___neigh_create does not decrement the counter in the error path.
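
A rough model of the counter leak, again in userspace Python and under
stated assumptions: CountedTable, TableOverflow and
packets_until_overflow are hypothetical names, the duplicate-entry case
stands in for the error path, and the garbage collection that the real
allocator attempts before giving up is left out.

```python
class TableOverflow(Exception):
    """Stands in for the 'neighbor table overflow!' failure."""


class CountedTable:
    def __init__(self, gc_thresh3, fix_error_path):
        self.gc_thresh3 = gc_thresh3
        self.fix_error_path = fix_error_path
        self.gc_entries = 0  # counter bumped on every allocation
        self.entries = {}

    def alloc(self):
        if self.gc_entries >= self.gc_thresh3:
            raise TableOverflow("neighbor table overflow!")
        self.gc_entries += 1

    def create(self, key):
        self.alloc()
        if key in self.entries:
            # Error path: the new entry is dropped. Without the
            # decrement, the counter keeps growing.
            if self.fix_error_path:
                self.gc_entries -= 1
            return self.entries[key]
        self.entries[key] = {"key": key}
        return self.entries[key]


def packets_until_overflow(tbl, limit=1000):
    """Return how many creates succeed before overflow, or None."""
    for sent in range(limit):
        try:
            tbl.create(0)
        except TableOverflow:
            return sent
    return None
```

With fix_error_path=False the model overflows after gc_thresh3 creates
even though only one entry is ever in the table, which is consistent
with the arp table not showing the extra entries. With
fix_error_path=True the counter stays at 1.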

Ian reported a similar problem, but we were not able to isolate the cause.

Thanks for the script - very helpful in resolving the bugs. I made some
changes to it and I plan to submit it to selftests as a starter for MPLS
tests.

Bug fix patches coming.
