lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <alpine.LRH.2.20.1905021055490.5146@dhcp-10-175-160-224.vpn.oracle.com> Date: Thu, 2 May 2019 11:00:07 +0100 (BST) From: Alan Maguire <alan.maguire@...cle.com> To: David Ahern <dsahern@...il.com> cc: Alan Maguire <alan.maguire@...cle.com>, netdev@...r.kernel.org, daniel@...earbox.net, Ian Kumlien <ian.kumlien@...il.com> Subject: Re: MPLS encapsulation and arp table overflow On Wed, 1 May 2019, David Ahern wrote: > On 5/1/19 10:03 AM, Alan Maguire wrote: > > I'm seeing the following repeated error > > > > [ 130.821362] neighbour: arp_cache: neighbor table overflow! > > > > when using MPLSoverGRE or MPLSoverUDP tunnels on bits synced > > with bpf-next as of this morning. The test script below reliably > > reproduces the problem, while working fine on a 4.14 (I haven't > > bisected yet). It can be run with no arguments, or specifying > > gre or udp for the specific encap type. > > > > It seems that every MPLS-encapsulated outbound packet is attempting > > to add a neighbor entry, and as a result we hit the > > net.ipv4.neigh.default.gc_thresh3 limit quickly. > > > > When this failure occurs, the arp table doesn't show any of > > these additional entries. Existing arp table entries are > > disappearing too, so perhaps they are being recycled when the > > table becomes full? > > > > There are 2 bugs: > 1. neigh_xmit fails to find a neighbor entry on every single Tx. This > was introduced by: > > cd9ff4de010 ("ipv4: Make neigh lookup keys for loopback/point-to-point > devices be INADDR_ANY") > > Basically, the primary_key is reset to 0 for tun's but the neigh_xmit > lookup was not corrected. > > That caused a new neigh entry to be created on each packet Tx, but > before inserting the new one to the table the create function looks to > see if an entry already exists. The arp constructor had reset the key to > 0 in the new neighbor entry so the second lookup finds a match and the > new one is dropped. > > That exposed a second bug. > > 2. neigh_alloc bumps the gc_entries counter when a new one is allocated, > but ___neigh_create is not dropping the counter in the error path. > > Ian reported a similar problem, but we were not able to isolate the cause. > Fantastic, thanks so much for the quick fixes! I verified them at my end, ensuring that with the patches applied to the latest net tree, the previously-failing test succeeds. > Thanks for the script - very helpful in resolving the bugs. I made some > changes to it and I plan to submit it to selftests as a starter for mpls > tests. > Sounds great! It's mostly cobbled together from Willem's bpf test_tc_tunnel.sh script, so like that could probably be generalized to cover more tunnel types too. Thanks again! Alan > Bug fix patches coming. >
Powered by blists - more mailing lists