[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20201027135524.912736526@linuxfoundation.org>
Date: Tue, 27 Oct 2020 14:46:30 +0100
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-kernel@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
stable@...r.kernel.org, Ido Schimmel <idosch@...dia.com>,
Jesse Brandeburg <jesse.brandeburg@...el.com>,
David Ahern <dsahern@...il.com>,
Nikolay Aleksandrov <nikolay@...dia.com>,
Jakub Kicinski <kuba@...nel.org>
Subject: [PATCH 5.8 047/633] nexthop: Fix performance regression in nexthop deletion
From: Ido Schimmel <idosch@...dia.com>
[ Upstream commit df6afe2f7c19349de2ee560dc62ea4d9ad3ff889 ]
While insertion of 16k nexthops all using the same netdev ('dummy10')
takes less than a second, deletion takes about 130 seconds:
# time -p ip -b nexthop.batch
real 0.29
user 0.01
sys 0.15
# time -p ip link set dev dummy10 down
real 131.03
user 0.06
sys 0.52
This is because of repeated calls to synchronize_rcu() whenever a
nexthop is removed from a nexthop group:
# /usr/share/bcc/tools/offcputime -p `pgrep -nx ip` -K
...
b'finish_task_switch'
b'schedule'
b'schedule_timeout'
b'wait_for_completion'
b'__wait_rcu_gp'
b'synchronize_rcu.part.0'
b'synchronize_rcu'
b'__remove_nexthop'
b'remove_nexthop'
b'nexthop_flush_dev'
b'nh_netdev_event'
b'raw_notifier_call_chain'
b'call_netdevice_notifiers_info'
b'__dev_notify_flags'
b'dev_change_flags'
b'do_setlink'
b'__rtnl_newlink'
b'rtnl_newlink'
b'rtnetlink_rcv_msg'
b'netlink_rcv_skb'
b'rtnetlink_rcv'
b'netlink_unicast'
b'netlink_sendmsg'
b'____sys_sendmsg'
b'___sys_sendmsg'
b'__sys_sendmsg'
b'__x64_sys_sendmsg'
b'do_syscall_64'
b'entry_SYSCALL_64_after_hwframe'
- ip (277)
126554955
Since nexthops are always deleted under RTNL, synchronize_net() can be
used instead. It will call synchronize_rcu_expedited() which only blocks
for several microseconds as opposed to multiple milliseconds like
synchronize_rcu().
With this patch deletion of 16k nexthops takes less than a second:
# time -p ip link set dev dummy10 down
real 0.12
user 0.00
sys 0.04
Tested with fib_nexthops.sh which includes torture tests that prompted
the initial change:
# ./fib_nexthops.sh
...
Tests passed: 134
Tests failed: 0
Fixes: 90f33bffa382 ("nexthops: don't modify published nexthop groups")
Signed-off-by: Ido Schimmel <idosch@...dia.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@...el.com>
Reviewed-by: David Ahern <dsahern@...il.com>
Acked-by: Nikolay Aleksandrov <nikolay@...dia.com>
Link: https://lore.kernel.org/r/20201016172914.643282-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@...nel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
---
net/ipv4/nexthop.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -842,7 +842,7 @@ static void remove_nexthop_from_groups(s
remove_nh_grp_entry(net, nhge, nlinfo);
/* make sure all see the newly published array before releasing rtnl */
- synchronize_rcu();
+ synchronize_net();
}
static void remove_nexthop_group(struct nexthop *nh, struct nl_info *nlinfo)
Powered by blists - more mailing lists