[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20191002181759.GE2279@nanopsycho>
Date: Wed, 2 Oct 2019 20:17:59 +0200
From: Jiri Pirko <jiri@...nulli.us>
To: Ido Schimmel <idosch@...sch.org>
Cc: netdev@...r.kernel.org, davem@...emloft.net, dsahern@...il.com,
jiri@...lanox.com, jakub.kicinski@...ronome.com,
saeedm@...lanox.com, mlxsw@...lanox.com,
Ido Schimmel <idosch@...lanox.com>
Subject: Re: [RFC PATCH net-next 00/15] Simplify IPv4 route offload API
Wed, Oct 02, 2019 at 10:40:48AM CEST, idosch@...sch.org wrote:
>From: Ido Schimmel <idosch@...lanox.com>
>
>Today, whenever an IPv4 route is added or deleted a notification is sent
>in the FIB notification chain and it is up to offload drivers to decide
>if the route should be programmed to the hardware or not. This is not an
>easy task as in hardware routes are keyed by {prefix, prefix length,
>table id}, whereas the kernel can store multiple such routes that only
>differ in metric / TOS / nexthop info.
>
>This series makes sure that only routes that are actually used in the
>data path are notified to offload drivers. This greatly simplifies the
>work these drivers need to do, as they are now only concerned with
>programming the hardware and do not need to replicate the IPv4 route
>insertion logic and store multiple identical routes.
>
>The route that is notified is the first FIB alias in the FIB node with
>the given {prefix, prefix length, table ID}. In case the route is
>deleted and there is another route with the same key, a replace
>notification is emitted. Otherwise, a delete notification is emitted.
>
>The above means that in the case of multiple routes with the same key,
>but different TOS, only the route with the highest TOS is notified.
>While the kernel can route a packet based on its TOS, this is not
>supported by any hardware devices I'm familiar with. Moreover, this is
>not supported by IPv6 nor by BIRD/FRR from what I could see. Offload
>drivers should therefore use the presence of a non-zero TOS as an
>indication to trap packets matching the route and let the kernel route
>them instead. mlxsw has been doing it for the past two years.
>
>The series also adds an "in hardware" indication to routes, in addition
I think this might be a separate patchset. I mean patch "ipv4: Replace
route in list before notifying" and above.
>to the offload indication we already have on nexthops today. Besides
>being long overdue, the reason this is done in this series is that it
>makes it possible to easily test the new FIB notification API over
>netdevsim.
>
>To ensure there is no degradation in route insertion rates, I used
>Vincent Bernat's script [1][2] from [3] to inject 500,000 routes from an
>MRT dump from a router with a full view. On a system with Intel(R)
>Xeon(R) CPU D-1527 @ 2.20GHz I measured 8.184 seconds, averaged over 10
>runs and saw no degradation compared to net-next from today.
>
>Patchset overview:
>Patches #1-#7 introduce the new FIB notifications
>Patches #8-#9 convert listeners to make use of the new notifications
>Patches #10-#14 add "in hardware" indication for IPv4 routes, including
>a dummy FIB offload implementation in netdevsim
>Patch #15 adds a selftest for the new FIB notifications API over
>netdevsim
>
>The series is based on Jiri's "devlink: allow devlink instances to
>change network namespace" series [4]. The patches can be found here [5]
>and patched iproute2 with the "in hardware" indication can be found here
>[6].
>
>IPv6 is next on my TODO list.
>
>[1] https://github.com/vincentbernat/network-lab/blob/master/common/helpers/lab-routes-ipvX/insert-from-bgp
>[2] https://gist.github.com/idosch/2eb96efe50eb5234d205e964f0814859
>[3] https://vincent.bernat.ch/en/blog/2017-ipv4-route-lookup-linux
>[4] https://patchwork.ozlabs.org/cover/1162295/
>[5] https://github.com/idosch/linux/tree/fib-notifier
>[6] https://github.com/idosch/iproute2/tree/fib-notifier
>
>Ido Schimmel (15):
> ipv4: Add temporary events to the FIB notification chain
> ipv4: Notify route after insertion to the routing table
> ipv4: Notify route if replacing currently offloaded one
> ipv4: Notify newly added route if should be offloaded
> ipv4: Handle route deletion notification
> ipv4: Handle route deletion notification during flush
> ipv4: Only Replay routes of interest to new listeners
> mlxsw: spectrum_router: Start using new IPv4 route notifications
> ipv4: Remove old route notifications and convert listeners
> ipv4: Replace route in list before notifying
> ipv4: Encapsulate function arguments in a struct
> ipv4: Add "in hardware" indication to routes
> mlxsw: spectrum_router: Mark routes as "in hardware"
> netdevsim: fib: Mark routes as "in hardware"
> selftests: netdevsim: Add test for route offload API
>
> .../net/ethernet/mellanox/mlx5/core/lag_mp.c | 4 -
> .../ethernet/mellanox/mlxsw/spectrum_router.c | 152 ++-----
> drivers/net/ethernet/rocker/rocker_main.c | 4 +-
> drivers/net/netdevsim/fib.c | 263 ++++++++++-
> include/net/ip_fib.h | 5 +
> include/uapi/linux/rtnetlink.h | 1 +
> net/ipv4/fib_lookup.h | 18 +-
> net/ipv4/fib_semantics.c | 30 +-
> net/ipv4/fib_trie.c | 223 ++++++++--
> net/ipv4/route.c | 12 +-
> .../drivers/net/netdevsim/fib_notifier.sh | 411 ++++++++++++++++++
> 11 files changed, 938 insertions(+), 185 deletions(-)
> create mode 100755 tools/testing/selftests/drivers/net/netdevsim/fib_notifier.sh
>
>--
>2.21.0
>
Powered by blists - more mailing lists