Message-ID: <20191002181759.GE2279@nanopsycho>
Date:   Wed, 2 Oct 2019 20:17:59 +0200
From:   Jiri Pirko <jiri@...nulli.us>
To:     Ido Schimmel <idosch@...sch.org>
Cc:     netdev@...r.kernel.org, davem@...emloft.net, dsahern@...il.com,
        jiri@...lanox.com, jakub.kicinski@...ronome.com,
        saeedm@...lanox.com, mlxsw@...lanox.com,
        Ido Schimmel <idosch@...lanox.com>
Subject: Re: [RFC PATCH net-next 00/15] Simplify IPv4 route offload API

Wed, Oct 02, 2019 at 10:40:48AM CEST, idosch@...sch.org wrote:
>From: Ido Schimmel <idosch@...lanox.com>
>
>Today, whenever an IPv4 route is added or deleted, a notification is sent
>in the FIB notification chain and it is up to offload drivers to decide
>if the route should be programmed to the hardware or not. This is not an
>easy task because hardware routes are keyed by {prefix, prefix length,
>table ID}, whereas the kernel can store multiple such routes that only
>differ in metric / TOS / nexthop info.
>
>This series makes sure that only routes that are actually used in the
>data path are notified to offload drivers. This greatly simplifies the
>work these drivers need to do, as they are now only concerned with
>programming the hardware and do not need to replicate the IPv4 route
>insertion logic and store multiple identical routes.
>
>The route that is notified is the first FIB alias in the FIB node with
>the given {prefix, prefix length, table ID}. In case the route is
>deleted and there is another route with the same key, a replace
>notification is emitted. Otherwise, a delete notification is emitted.
>
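
The notification rule quoted above can be modelled in a few lines. This is
only a sketch, not the kernel code: Route, FibModel and the event strings are
hypothetical names, the ordering (highest TOS first, then lowest metric) is
an assumption, and so are two behaviours the cover letter does not spell
out: that a newly added route which becomes the first alias is reported as a
"replace" when the key was already populated, and that deleting a route
which is not the first alias emits nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    prefix_len: int
    table_id: int
    tos: int = 0
    metric: int = 0

    @property
    def key(self):
        # Hardware key from the cover letter: {prefix, prefix length, table ID}.
        return (self.prefix, self.prefix_len, self.table_id)


class FibModel:
    """Toy model: only the first alias per key is notified to listeners."""

    def __init__(self):
        self.nodes = {}   # key -> list of Route, index 0 is the one in use
        self.events = []  # (event, Route) tuples "sent" to offload drivers

    @staticmethod
    def _order(route):
        # Assumed ordering: highest TOS first, then lowest metric.
        return (-route.tos, route.metric)

    def add(self, route):
        aliases = self.nodes.setdefault(route.key, [])
        had_first = bool(aliases)
        aliases.append(route)
        aliases.sort(key=self._order)  # stable: ties keep the older route first
        if aliases[0] is route:
            self.events.append(("replace" if had_first else "add", route))

    def delete(self, route):
        aliases = self.nodes[route.key]
        was_first = aliases[0] == route
        aliases.remove(route)
        if not aliases:
            del self.nodes[route.key]
        if not was_first:
            return  # assumption: routes not in use were never notified
        if aliases:
            # Another route with the same key takes over: replace notification.
            self.events.append(("replace", aliases[0]))
        else:
            self.events.append(("delete", route))
```

With two routes that differ only in metric, adding the better one after the
worse one yields an "add" followed by a "replace", and deleting the better
one yields a "replace" that points back at the remaining route.
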
>The above means that in the case of multiple routes with the same key,
>but different TOS, only the route with the highest TOS is notified.
>While the kernel can route a packet based on its TOS, this is not
>supported by any hardware devices I'm familiar with. Moreover, this is
>not supported by IPv6 nor by BIRD/FRR from what I could see. Offload
>drivers should therefore use the presence of a non-zero TOS as an
>indication to trap packets matching the route and let the kernel route
>them instead. mlxsw has been doing it for the past two years.
>
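
The TOS handling described above amounts to a one-line decision on the
driver side. A minimal sketch, assuming a hypothetical Route type and
hypothetical "trap" / "forward" action names (none of these are taken from
mlxsw or any other driver):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    prefix_len: int
    table_id: int
    tos: int = 0


def hw_action(route: Route) -> str:
    """Pick the action a driver would program for the notified route.

    A non-zero TOS means the hardware cannot honour the route's semantics,
    so matching packets are trapped and routed by the kernel instead.
    """
    return "trap" if route.tos != 0 else "forward"
```

For example, hw_action(Route("198.51.100.0", 24, 254, tos=0x10)) selects the
trap action, while the same route with TOS 0 is forwarded in hardware.
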
>The series also adds an "in hardware" indication to routes, in addition

I think the "in hardware" indication could be a separate patchset. I mean
patch "ipv4: Replace route in list before notifying" and the patches after it.


>to the offload indication we already have on nexthops today. Besides
>being long overdue, the reason this is done in this series is that it
>makes it possible to easily test the new FIB notification API over
>netdevsim.
>
>To ensure there is no degradation in route insertion rates, I used
>Vincent Bernat's script [1][2] from [3] to inject 500,000 routes from an
>MRT dump from a router with a full view. On a system with Intel(R)
>Xeon(R) CPU D-1527 @ 2.20GHz I measured 8.184 seconds, averaged over 10
>runs and saw no degradation compared to net-next from today.
>
>Patchset overview:
>Patches #1-#7 introduce the new FIB notifications
>Patches #8-#9 convert listeners to make use of the new notifications
>Patches #10-#14 add "in hardware" indication for IPv4 routes, including
>a dummy FIB offload implementation in netdevsim
>Patch #15 adds a selftest for the new FIB notifications API over
>netdevsim
>
>The series is based on Jiri's "devlink: allow devlink instances to
>change network namespace" series [4]. The patches can be found here [5]
>and patched iproute2 with the "in hardware" indication can be found here
>[6].
>
>IPv6 is next on my TODO list.
>
>[1] https://github.com/vincentbernat/network-lab/blob/master/common/helpers/lab-routes-ipvX/insert-from-bgp
>[2] https://gist.github.com/idosch/2eb96efe50eb5234d205e964f0814859
>[3] https://vincent.bernat.ch/en/blog/2017-ipv4-route-lookup-linux
>[4] https://patchwork.ozlabs.org/cover/1162295/
>[5] https://github.com/idosch/linux/tree/fib-notifier
>[6] https://github.com/idosch/iproute2/tree/fib-notifier
>
>Ido Schimmel (15):
>  ipv4: Add temporary events to the FIB notification chain
>  ipv4: Notify route after insertion to the routing table
>  ipv4: Notify route if replacing currently offloaded one
>  ipv4: Notify newly added route if should be offloaded
>  ipv4: Handle route deletion notification
>  ipv4: Handle route deletion notification during flush
>  ipv4: Only Replay routes of interest to new listeners
>  mlxsw: spectrum_router: Start using new IPv4 route notifications
>  ipv4: Remove old route notifications and convert listeners
>  ipv4: Replace route in list before notifying
>  ipv4: Encapsulate function arguments in a struct
>  ipv4: Add "in hardware" indication to routes
>  mlxsw: spectrum_router: Mark routes as "in hardware"
>  netdevsim: fib: Mark routes as "in hardware"
>  selftests: netdevsim: Add test for route offload API
>
> .../net/ethernet/mellanox/mlx5/core/lag_mp.c  |   4 -
> .../ethernet/mellanox/mlxsw/spectrum_router.c | 152 ++-----
> drivers/net/ethernet/rocker/rocker_main.c     |   4 +-
> drivers/net/netdevsim/fib.c                   | 263 ++++++++++-
> include/net/ip_fib.h                          |   5 +
> include/uapi/linux/rtnetlink.h                |   1 +
> net/ipv4/fib_lookup.h                         |  18 +-
> net/ipv4/fib_semantics.c                      |  30 +-
> net/ipv4/fib_trie.c                           | 223 ++++++++--
> net/ipv4/route.c                              |  12 +-
> .../drivers/net/netdevsim/fib_notifier.sh     | 411 ++++++++++++++++++
> 11 files changed, 938 insertions(+), 185 deletions(-)
> create mode 100755 tools/testing/selftests/drivers/net/netdevsim/fib_notifier.sh
>
>-- 
>2.21.0
>
