Message-ID: <20200622214449.gyfn33ickesj2j2t@lion.mk-sys.cz>
Date: Mon, 22 Jun 2020 23:44:49 +0200
From: Michal Kubecek <mkubecek@...e.cz>
To: netdev@...r.kernel.org
Cc: Oliver Herms <oliver.peter.herms@...il.com>, davem@...emloft.net,
kuznet@....inr.ac.ru, yoshfuji@...ux-ipv6.org, kuba@...nel.org
Subject: Re: [PATCH] IPv6: Fix CPU contention on FIB6 GC

On Mon, Jun 22, 2020 at 10:53:55PM +0200, Oliver Herms wrote:
> When fib6_run_gc is called with parameter force=true the spinlock in
> /net/ipv6/ip6_fib.c:2310 can lock all CPUs in softirq when
> net.ipv6.route.max_size is exceeded (seen this multiple times).
> One softirq/CPU gets the lock. All others spin to get it. It takes
> substantial time until all are done. Effectively it's a DoS vector.
>
> As the spinlock is only enforcing that there is at most one GC running
> at a time, it should IMHO be safe to use force=false here resulting
> in spin_trylock_bh instead of spin_lock_bh, thus avoiding the lock
> contention.
>
> Finding a locked spinlock means some GC is going on already so it is
> safe to just skip another execution of the GC.
>
> Signed-off-by: Oliver Herms <oliver.peter.herms@...il.com>

I wonder if it wouldn't suffice to revert commit 14956643550f ("ipv6:
slight optimization in ip6_dst_gc"), as the reasoning in its commit
message seems wrong: we do not always skip fib6_run_gc() when
entries <= rt_max_size; we do so only if the time since the last garbage
collector run is shorter than rt_min_interval.

Then you would prevent the "thundering herd" effect when only gc_thresh
is exceeded but not max_size, as commit 2ac3ac8f86f2 ("ipv6: prevent
fib6_run_gc() contention") intended, but would still preserve forced
garbage collection when max_size is exceeded.
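
The difference can be sketched as a small user-space model. This is only an
illustration of the decision described above, not kernel code: the enum and
both function names are hypothetical, time is passed in as a plain elapsed
value instead of jiffies, and the force argument after a revert is assumed to
be "entries > rt_max_size".

```c
/* Hypothetical user-space model of the ip6_dst_gc() decision; not kernel code. */
enum gc_action {
	GC_SKIP,	/* fib6_run_gc() is not called at all */
	GC_TRYLOCK,	/* force=false: give up if a GC already holds the lock */
	GC_FORCE,	/* force=true: spin until the lock is acquired */
};

/*
 * Assumed behaviour with 14956643550f reverted: force only when
 * max_size is exceeded, otherwise use the trylock path.
 */
static enum gc_action gc_action_reverted(int entries, int rt_max_size,
					 long since_last_gc,
					 long rt_min_interval)
{
	/* Skip only if the last GC run was recent AND we are within max_size. */
	if (since_last_gc < rt_min_interval && entries <= rt_max_size)
		return GC_SKIP;
	return entries > rt_max_size ? GC_FORCE : GC_TRYLOCK;
}

/* Behaviour with 14956643550f applied: every run that is not skipped forces. */
static enum gc_action gc_action_current(int entries, int rt_max_size,
					long since_last_gc,
					long rt_min_interval)
{
	if (since_last_gc < rt_min_interval && entries <= rt_max_size)
		return GC_SKIP;
	return GC_FORCE;
}
```

The two variants differ only in the case where entries <= rt_max_size and
rt_min_interval has already elapsed: the reverted variant takes the trylock
path there, while the current one still spins on the lock.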

Michal

> ---
> net/ipv6/route.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 82cbb46a2a4f..7e6fbaf43549 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -3205,7 +3205,7 @@ static int ip6_dst_gc(struct dst_ops *ops)
>  		goto out;
>
>  	net->ipv6.ip6_rt_gc_expire++;
> -	fib6_run_gc(net->ipv6.ip6_rt_gc_expire, net, true);
> +	fib6_run_gc(net->ipv6.ip6_rt_gc_expire, net, false);
>  	entries = dst_entries_get_slow(ops);
>  	if (entries < ops->gc_thresh)
>  		net->ipv6.ip6_rt_gc_expire = rt_gc_timeout>>1;
> --
> 2.25.1
>