netdev - Re: [PATCH] IPv6: Fix CPU contention on FIB6 GC

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <3588d0fc-5c6b-b62a-f137-24abcf660d5f@gmail.com>
Date:   Tue, 23 Jun 2020 00:46:34 +0200
From:   Oliver Herms <oliver.peter.herms@...il.com>
To:     Michal Kubecek <mkubecek@...e.cz>, netdev@...r.kernel.org
Cc:     davem@...emloft.net, kuznet@....inr.ac.ru, yoshfuji@...ux-ipv6.org,
        kuba@...nel.org
Subject: Re: [PATCH] IPv6: Fix CPU contention on FIB6 GC

On 22.06.20 23:44, Michal Kubecek wrote:
> On Mon, Jun 22, 2020 at 10:53:55PM +0200, Oliver Herms wrote:
>> When fib6_run_gc is called with parameter force=true the spinlock in
>> /net/ipv6/ip6_fib.c:2310 can lock all CPUs in softirq when
>> net.ipv6.route.max_size is exceeded (seen this multiple times).
>> One sotirq/CPU get's the lock. All others spin to get it. It takes
>> substantial time until all are done. Effectively it's a DOS vector.
>>
>> As the splinlock is only enforcing that there is at most one GC running
>> at a time, it should IMHO be safe to use force=false here resulting
>> in spin_trylock_bh instead of spin_lock_bh, thus avoiding the lock
>> contention.
>>
>> Finding a locked spinlock means some GC is going on already so it is
>> save to just skip another execution of the GC.
>>
>> Signed-off-by: Oliver Herms <oliver.peter.herms@...il.com>
> 
> I wonder if it wouldn't suffice to revert commit 14956643550f ("ipv6:
> slight optimization in ip6_dst_gc") as the reasoning in its commit
> message seems wrong: we do not always skip fib6_run_gc() when
> entries <= rt_max_size, we do so only if the time since last garbage
> collector run is shorter than rt_min_interval.
> 
> Then you would prevent the "thundering herd" effect when only gc_thresh
> is exceeded but not max_size, as commit 2ac3ac8f86f2 ("ipv6: prevent
> fib6_run_gc() contention") intended, but would still preserve enforced
> garbage collect when max_size is exceeded.
> 
> Michal
> 

Hi Michal,

I did some testing with packets causing 17k IPv6 route cache entries per 
second. With "entries > rt_max_size" all CPUs of the system get stuck 
waiting for the spinlock. With "false" CPU load stays at <<10% on every single
CPU core (tested on an Atom C2750). This makes sense as "entries > rt_max_size"
would not prevent multiple CPUs from trying to get the lock.

So reverting 14956643550f is not enough.

Kind Regards
Oliver