Message-ID: <04671504-5ee4-7cfa-7b28-9e7f15c9f607@gmail.com>
Date: Wed, 24 Jun 2020 09:42:23 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Oliver Herms <oliver.peter.herms@...il.com>,
Michal Kubecek <mkubecek@...e.cz>
Cc: Eric Dumazet <eric.dumazet@...il.com>, netdev@...r.kernel.org,
davem@...emloft.net, kuznet@....inr.ac.ru, yoshfuji@...ux-ipv6.org,
kuba@...nel.org
Subject: Re: [PATCH] IPv6: Fix CPU contention on FIB6 GC
On 6/24/20 3:34 AM, Oliver Herms wrote:
> On 24.06.20 00:06, Michal Kubecek wrote:
>> On Tue, Jun 23, 2020 at 01:30:29AM +0200, Oliver Herms wrote:
>>>
>>> I'm encountering the issues due to cache entries that are created by
>>> tnl_update_pmtu. However, I'm going to address that issue in another thread
>>> and patch.
>>>
>>> As cache entries can be created in many ways, this should be fixed at the
>>> GC level.
>>
>> Actually, not that many: starting with (IIRC) 4.2, the IPv6 routing cache is
>> only used for exceptions (like PMTU), not for regular lookup results.
>>
>
> Hi Michal,
>
> Right. That is the intention. But the reality is that when sending IPv6
> packets with an MPLS encap route into a SIT/FOU tunnel, a cache entry is
> created for every single destination on the tunnel. Now that IS a bug by
> itself, and I'll shortly submit a patch that should fix it.
Thanks !
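
For context, the path being described is, roughly (hand-condensed, not
verbatim kernel code; details vary between versions): in
net/ipv4/ip_tunnel.c, tnl_update_pmtu() compares the inner packet against
the tunnel MTU and, when the packet does not fit, essentially does

	if (skb->protocol == htons(ETH_P_IPV6) && mtu < pkt_size)
		skb_dst_update_pmtu(skb, mtu);	/* updates the *inner* dst */

For an IPv6 inner flow that lands in net/ipv6/route.c:ip6_rt_update_pmtu(),
which caches the lowered MTU by inserting a per-destination exception route
(rt6_insert_exception()): one cached entry per destination sent through the
tunnel, which is exactly what fills the cache here.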
>
> However, when a tunnel uses PMTU discovery and the tunnel source receives an
> ICMP Packet Too Big message for the tunnel destination, that triggers the
> creation of IPv6 route cache entries (and, for IPv4, entries in the
> corresponding data structure) for every destination to which packets are
> sent through the tunnel.
>
> Both of these factors,
> 1. the presence or absence of (possibly spoofed) ICMP Packet Too Big messages for the tunnel,
> 2. the number of flows through a tunnel (attackers can simply create more flows),
> are not fully under an operator's control.
>
> Thus the assumption that a sufficiently large max_size would solve the
> problem is not correct.
Our intention is to get rid of the IPv6 garbage collection, so we need to make sure
we do not rely on it ;)
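
To make the max_size interaction concrete, here is the IPv6 dst GC entry
point, condensed from net/ipv6/route.c at roughly v5.7 (details vary between
versions). Note that once the cache is over max_size, every CPU that tries
to allocate a new entry ends up forcing a GC run:

static int ip6_dst_gc(struct dst_ops *ops)
{
	struct net *net = container_of(ops, struct net, ipv6.ip6_dst_ops);
	int rt_min_interval = net->ipv6.sysctl.ip6_rt_gc_min_interval;
	int rt_max_size = net->ipv6.sysctl.ip6_rt_max_size;
	int rt_elasticity = net->ipv6.sysctl.ip6_rt_gc_elasticity;
	int rt_gc_timeout = net->ipv6.sysctl.ip6_rt_gc_timeout;
	unsigned long rt_last_gc = net->ipv6.ip6_rt_last_gc;
	int entries;

	entries = dst_entries_get_fast(ops);
	/* Under max_size and within the min interval: skip the GC. */
	if (time_after(rt_last_gc + rt_min_interval, jiffies) &&
	    entries <= rt_max_size)
		goto out;

	/* Over max_size: every CPU reaching this point forces a GC run. */
	net->ipv6.ip6_rt_gc_expire++;
	fib6_run_gc(net->ipv6.ip6_rt_gc_expire, net, true);
	entries = dst_entries_get_slow(ops);
	if (entries < ops->gc_thresh)
		net->ipv6.ip6_rt_gc_expire = rt_gc_timeout >> 1;
out:
	net->ipv6.ip6_rt_gc_expire -= net->ipv6.ip6_rt_gc_expire >> rt_elasticity;
	return entries > rt_max_size;
}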
>
> Regarding your argument about making the limit "softer":
> There are only two functions that take fib6_gc_lock:
> 1. fib6_net_init
> 2. fib6_run_gc
>
> fib6_net_init is only called when the network namespace is initialized.
>
> fib6_run_gc clears all entries that are subject to garbage collection.
> There is no gain in doing that N times (with N = the number of CPUs) while
> spinlocking all CPUs. Cleaning once is enough. A GC run is so short that by
> the time it finishes, there is most probably no new entry in the cache that
> is ready to be removed. And even if there is, the next call to ip6_dst_gc
> will happen when a new entry is added to the cache. Thus I can't see how my
> patch makes the time limit softer.
>
These are really minor implementation details; we need to find the root cause.
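
For reference, the function whose lock is being discussed, condensed from
net/ipv6/ip6_fib.c at roughly v5.7. When force is false it already degrades
to a trylock, so only one CPU performs the sweep; the contention described
in this thread comes from ip6_dst_gc() calling it with force == true, which
makes every CPU block on fib6_gc_lock:

void fib6_run_gc(unsigned long expires, struct net *net, bool force)
{
	struct fib6_gc_args gc_args;
	unsigned long now;

	if (force) {
		/* Every caller blocks until the sweep is done. */
		spin_lock_bh(&net->ipv6.fib6_gc_lock);
	} else if (!spin_trylock_bh(&net->ipv6.fib6_gc_lock)) {
		/* Another CPU is already sweeping: re-arm the timer and bail. */
		mod_timer(&net->ipv6.ip6_fib_timer, jiffies + HZ);
		return;
	}
	gc_args.timeout = expires ? (int)expires :
			  net->ipv6.sysctl.ip6_rt_gc_interval;
	gc_args.more = 0;

	/* Walk the FIB and age out expired routes/exceptions. */
	fib6_clean_all(net, fib6_age, &gc_args);
	now = jiffies;
	net->ipv6.ip6_rt_last_gc = now;

	if (gc_args.more)
		mod_timer(&net->ipv6.ip6_fib_timer,
			  round_jiffies(now +
					net->ipv6.sysctl.ip6_rt_gc_interval));
	else
		del_timer(&net->ipv6.ip6_fib_timer);
	spin_unlock_bh(&net->ipv6.fib6_gc_lock);
}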