Message-ID: <f0cf2c4d-3432-e904-1d27-1de5c88e5b34@gmail.com>
Date: Fri, 16 Oct 2020 22:37:38 -0600
From: David Ahern <dsahern@...il.com>
To: Ido Schimmel <idosch@...sch.org>, netdev@...r.kernel.org
Cc: davem@...emloft.net, kuba@...nel.org, nikolay@...dia.com,
mlxsw@...dia.com, Ido Schimmel <idosch@...dia.com>
Subject: Re: [PATCH net] nexthop: Fix performance regression in nexthop deletion

On 10/16/20 11:29 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@...dia.com>
>
> While insertion of 16k nexthops all using the same netdev ('dummy10')
> takes less than a second, deletion takes about 130 seconds:
>
> # time -p ip -b nexthop.batch
> real 0.29
> user 0.01
> sys 0.15
>
> # time -p ip link set dev dummy10 down
> real 131.03
> user 0.06
> sys 0.52
>
> This is because of repeated calls to synchronize_rcu() whenever a
> nexthop is removed from a nexthop group:
>
> # /usr/share/bcc/tools/offcputime -p `pgrep -nx ip` -K
> ...
> b'finish_task_switch'
> b'schedule'
> b'schedule_timeout'
> b'wait_for_completion'
> b'__wait_rcu_gp'
> b'synchronize_rcu.part.0'
> b'synchronize_rcu'
> b'__remove_nexthop'
> b'remove_nexthop'
> b'nexthop_flush_dev'
> b'nh_netdev_event'
> b'raw_notifier_call_chain'
> b'call_netdevice_notifiers_info'
> b'__dev_notify_flags'
> b'dev_change_flags'
> b'do_setlink'
> b'__rtnl_newlink'
> b'rtnl_newlink'
> b'rtnetlink_rcv_msg'
> b'netlink_rcv_skb'
> b'rtnetlink_rcv'
> b'netlink_unicast'
> b'netlink_sendmsg'
> b'____sys_sendmsg'
> b'___sys_sendmsg'
> b'__sys_sendmsg'
> b'__x64_sys_sendmsg'
> b'do_syscall_64'
> b'entry_SYSCALL_64_after_hwframe'
> - ip (277)
> 126554955
>
> Since nexthops are always deleted under RTNL, synchronize_net() can be
> used instead. It will call synchronize_rcu_expedited() which only blocks
> for several microseconds as opposed to multiple milliseconds like
> synchronize_rcu().
>
> With this patch deletion of 16k nexthops takes less than a second:
>
> # time -p ip link set dev dummy10 down
> real 0.12
> user 0.00
> sys 0.04
>
> Tested with fib_nexthops.sh which includes torture tests that prompted
> the initial change:
>
> # ./fib_nexthops.sh
> ...
> Tests passed: 134
> Tests failed: 0
>
> Fixes: 90f33bffa382 ("nexthops: don't modify published nexthop groups")
> Signed-off-by: Ido Schimmel <idosch@...dia.com>
> ---
> net/ipv4/nexthop.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
Thanks for finding this, Ido.
Reviewed-by: David Ahern <dsahern@...il.com>
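
For readers following the thread, the quoted explanation of why synchronize_net() helps can be sketched as a small userspace model. This is only an illustration, not the patch itself: the counters, the rtnl_locked flag, and flush_nexthops() are hypothetical stand-ins, and the real primitives are kernel functions that actually block. The branch in synchronize_net() mirrors the behaviour described in the commit message, where the expedited grace period is chosen when RTNL is held.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins for kernel state and primitives. */
static bool rtnl_locked;             /* models whether RTNL is held */
static unsigned long normal_gps;     /* normal grace periods waited for */
static unsigned long expedited_gps;  /* expedited grace periods waited for */

static bool rtnl_is_locked(void) { return rtnl_locked; }

/* In the kernel this blocks for a full grace period (milliseconds). */
static void synchronize_rcu(void) { normal_gps++; }

/* In the kernel this blocks far more briefly (microseconds). */
static void synchronize_rcu_expedited(void) { expedited_gps++; }

/* Picks the expedited variant when the caller holds RTNL. */
static void synchronize_net(void)
{
	if (rtnl_is_locked())
		synchronize_rcu_expedited();
	else
		synchronize_rcu();
}

/*
 * Hypothetical model of flushing n nexthops under RTNL, with one
 * grace-period wait per nexthop removed from a group.
 */
static void flush_nexthops(unsigned int n, bool use_synchronize_net)
{
	unsigned int i;

	rtnl_locked = true;
	for (i = 0; i < n; i++) {
		if (use_synchronize_net)
			synchronize_net();
		else
			synchronize_rcu();
	}
	rtnl_locked = false;
}
```

In this model, flushing with use_synchronize_net set to false accumulates only normal grace periods, while setting it to true accumulates only expedited ones, which is the difference behind the drop from about 131 seconds to 0.12 seconds reported above.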