Message-Id: <20191002084103.12138-1-idosch@idosch.org>
Date: Wed, 2 Oct 2019 11:40:48 +0300
From: Ido Schimmel <idosch@...sch.org>
To: netdev@...r.kernel.org
Cc: davem@...emloft.net, dsahern@...il.com, jiri@...lanox.com,
jakub.kicinski@...ronome.com, saeedm@...lanox.com,
mlxsw@...lanox.com, Ido Schimmel <idosch@...lanox.com>
Subject: [RFC PATCH net-next 00/15] Simplify IPv4 route offload API
From: Ido Schimmel <idosch@...lanox.com>
Today, whenever an IPv4 route is added or deleted, a notification is
sent in the FIB notification chain and it is up to offload drivers to
decide whether the route should be programmed to the hardware. This is
not an easy task, because hardware keys routes by {prefix, prefix
length, table ID}, whereas the kernel can store multiple such routes
that differ only in metric / TOS / nexthop info.
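As a rough userspace illustration of the mismatch described above (not
kernel code; the struct and function names are hypothetical and exist
only for this sketch), two kernel routes that differ only in metric
collapse to the same hardware key:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a kernel IPv4 route. */
struct demo_route {
    uint32_t prefix;     /* network address, host byte order */
    uint8_t  prefix_len; /* 0..32 */
    uint32_t table_id;   /* routing table the route belongs to */
    uint32_t metric;     /* kernel-only: not part of the hardware key */
    uint8_t  tos;        /* kernel-only: not part of the hardware key */
};

/*
 * Returns true when both routes would occupy the same hardware entry,
 * i.e. they agree on {prefix, prefix length, table ID}.
 */
bool demo_same_hw_key(const struct demo_route *a, const struct demo_route *b)
{
    assert(a != NULL && b != NULL);

    return a->prefix == b->prefix &&
           a->prefix_len == b->prefix_len &&
           a->table_id == b->table_id;
}
```

With 198.51.100.0/24 in table 254 installed twice with metrics 100 and
200, demo_same_hw_key() reports a collision, so a driver that sees both
notifications has to work out on its own which one belongs in hardware.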
This series makes sure that only routes that are actually used in the
data path are notified to offload drivers. This greatly simplifies the
work these drivers need to do, as they are now only concerned with
programming the hardware and do not need to replicate the IPv4 route
insertion logic and store multiple identical routes.
The route that is notified is the first FIB alias in the FIB node with
the given {prefix, prefix length, table ID}. In case the route is
deleted and there is another route with the same key, a replace
notification is emitted. Otherwise, a delete notification is emitted.
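The deletion rule can be sketched in userspace C as follows. This is
only a model under stated assumptions, not the code from the series:
the enum, struct and function names are hypothetical, a FIB node is
reduced to an array of route IDs in kernel order, and it is assumed
that deleting a route that is not the first alias emits nothing, since
that route was never notified.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event types for this sketch only. */
enum demo_event {
    DEMO_EV_NONE,    /* nothing is sent to offload drivers */
    DEMO_EV_REPLACE, /* program route_id in place of the deleted route */
    DEMO_EV_DEL      /* remove route_id, no route with this key is left */
};

struct demo_notification {
    enum demo_event event;
    int route_id; /* -1 when event is DEMO_EV_NONE */
};

/*
 * aliases[] holds the route IDs of one FIB node, i.e. all routes that
 * share {prefix, prefix length, table ID}. Index 0 is the first alias,
 * the only one offload drivers know about. 'victim' is the index of
 * the route being deleted.
 */
struct demo_notification demo_on_delete(const int *aliases, size_t n,
                                        size_t victim)
{
    struct demo_notification out = { DEMO_EV_NONE, -1 };

    assert(aliases != NULL && victim < n);

    if (victim != 0)
        return out; /* assumption: route was never notified */

    if (n > 1) {
        /* Another route with the same key takes over. */
        out.event = DEMO_EV_REPLACE;
        out.route_id = aliases[1];
    } else {
        out.event = DEMO_EV_DEL;
        out.route_id = aliases[0];
    }
    return out;
}
```

For a node holding routes {10, 20, 30}, deleting route 10 yields a
replace notification for route 20, while deleting the only route of a
node yields a delete notification.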
The above means that in the case of multiple routes with the same key,
but different TOS, only the route with the highest TOS is notified.
While the kernel can route a packet based on its TOS, this is not
supported by any hardware devices I'm familiar with. Moreover, from what
I could see, it is supported neither by IPv6 nor by BIRD/FRR. Offload
drivers should therefore use the presence of a non-zero TOS as an
indication to trap packets matching the route and let the kernel route
them instead. mlxsw has been doing it for the past two years.
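A minimal sketch of that driver-side decision, assuming a simplified
route description; the names below are hypothetical and do not
correspond to mlxsw or to any kernel structure:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hardware actions for this sketch only. */
enum demo_hw_action {
    DEMO_HW_FORWARD, /* hardware forwards matching packets itself */
    DEMO_HW_TRAP     /* matching packets are sent to the kernel */
};

/* Hypothetical, simplified view of the route a driver is notified of. */
struct demo_notified_route {
    uint32_t prefix;
    uint8_t  prefix_len;
    uint32_t table_id;
    uint8_t  tos; /* TOS of the first FIB alias */
};

/*
 * Hardware cannot match on TOS, so a non-zero TOS on the notified
 * route means the kernel must make the forwarding decision.
 */
enum demo_hw_action demo_pick_action(const struct demo_notified_route *r)
{
    assert(r != NULL);

    return r->tos != 0 ? DEMO_HW_TRAP : DEMO_HW_FORWARD;
}
```

Trapping keeps forwarding correct at the cost of handling those packets
in software: lower-TOS routes with the same key are never notified, so
hardware alone cannot tell which of them a given packet should use.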
The series also adds an "in hardware" indication to routes, in addition
to the offload indication we already have on nexthops today. Besides
being long overdue, the reason this is done in this series is that it
makes it possible to easily test the new FIB notification API over
netdevsim.
To ensure there is no degradation in route insertion rates, I used
Vincent Bernat's script [1][2] from [3] to inject 500,000 routes from an
MRT dump from a router with a full view. On a system with an Intel(R)
Xeon(R) CPU D-1527 @ 2.20GHz I measured 8.184 seconds, averaged over 10
runs, and saw no degradation compared to net-next from today.
Patchset overview:
Patches #1-#7 introduce the new FIB notifications
Patches #8-#9 convert listeners to make use of the new notifications
Patches #10-#14 add "in hardware" indication for IPv4 routes, including
  a dummy FIB offload implementation in netdevsim
Patch #15 adds a selftest for the new FIB notifications API over
  netdevsim
The series is based on Jiri's "devlink: allow devlink instances to
change network namespace" series [4]. The patches can be found here [5]
and a patched iproute2 with the "in hardware" indication can be found
here [6].
IPv6 is next on my TODO list.
[1] https://github.com/vincentbernat/network-lab/blob/master/common/helpers/lab-routes-ipvX/insert-from-bgp
[2] https://gist.github.com/idosch/2eb96efe50eb5234d205e964f0814859
[3] https://vincent.bernat.ch/en/blog/2017-ipv4-route-lookup-linux
[4] https://patchwork.ozlabs.org/cover/1162295/
[5] https://github.com/idosch/linux/tree/fib-notifier
[6] https://github.com/idosch/iproute2/tree/fib-notifier
Ido Schimmel (15):
  ipv4: Add temporary events to the FIB notification chain
  ipv4: Notify route after insertion to the routing table
  ipv4: Notify route if replacing currently offloaded one
  ipv4: Notify newly added route if should be offloaded
  ipv4: Handle route deletion notification
  ipv4: Handle route deletion notification during flush
  ipv4: Only Replay routes of interest to new listeners
  mlxsw: spectrum_router: Start using new IPv4 route notifications
  ipv4: Remove old route notifications and convert listeners
  ipv4: Replace route in list before notifying
  ipv4: Encapsulate function arguments in a struct
  ipv4: Add "in hardware" indication to routes
  mlxsw: spectrum_router: Mark routes as "in hardware"
  netdevsim: fib: Mark routes as "in hardware"
  selftests: netdevsim: Add test for route offload API
 .../net/ethernet/mellanox/mlx5/core/lag_mp.c  |   4 -
 .../ethernet/mellanox/mlxsw/spectrum_router.c | 152 ++-----
 drivers/net/ethernet/rocker/rocker_main.c     |   4 +-
 drivers/net/netdevsim/fib.c                   | 263 ++++++++++-
 include/net/ip_fib.h                          |   5 +
 include/uapi/linux/rtnetlink.h                |   1 +
 net/ipv4/fib_lookup.h                         |  18 +-
 net/ipv4/fib_semantics.c                      |  30 +-
 net/ipv4/fib_trie.c                           | 223 ++++++++--
 net/ipv4/route.c                              |  12 +-
 .../drivers/net/netdevsim/fib_notifier.sh     | 411 ++++++++++++++++++
 11 files changed, 938 insertions(+), 185 deletions(-)
 create mode 100755 tools/testing/selftests/drivers/net/netdevsim/fib_notifier.sh
--
2.21.0