Message-Id: <1474622535-4002-1-git-send-email-jiri@resnulli.us>
Date: Fri, 23 Sep 2016 11:22:09 +0200
From: Jiri Pirko <jiri@...nulli.us>
To: netdev@...r.kernel.org
Cc: davem@...emloft.net, idosch@...lanox.com, eladr@...lanox.com,
yotamg@...lanox.com, nogahf@...lanox.com, ogerlitz@...lanox.com,
roopa@...ulusnetworks.com, nikolay@...ulusnetworks.com,
linville@...driver.com, andy@...yhouse.net, f.fainelli@...il.com,
dsa@...ulusnetworks.com, jhs@...atatu.com,
vivien.didelot@...oirfairelinux.com, andrew@...n.ch,
ivecera@...hat.com, kaber@...sh.net, john@...ozen.org
Subject: [patch net-next v2 0/6] fib offload: switch to notifier
From: Jiri Pirko <jiri@...lanox.com>

The goal of this patchset is to allow drivers to propagate all prefixes
configured in the kernel down to HW. This is necessary for routing to work
as expected. If we don't do that, HW might incorrectly forward packets
matching prefixes known only to the kernel. Take an example where a default
route is set in switch HW and there is an IP address set on a management
(non-switch) port: HW does not know about that address, so packets destined
to it would follow the default route instead of being passed to the kernel.

Currently, only FIB entries related to the switch port netdev are
offloaded using switchdev ops. This model is not extensible, so the
first patch introduces a replacement: a notifier that propagates FIB entry
additions and removals to whoever is interested.
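
To illustrate the notifier model, here is a minimal user-space sketch. It is
not the code from patch 1: all names in it (fib_listener, fib_listeners_call,
FIB_ENTRY_ADD and so on) are hypothetical stand-ins, and the in-kernel version
would build on the existing notifier chain infrastructure instead of a
hand-rolled list. The point is only that every registered listener sees every
FIB change, whether or not the entry is related to its own netdevs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event types for this sketch. */
enum fib_event { FIB_ENTRY_ADD, FIB_ENTRY_DEL };

/* Simplified FIB entry: IPv4 prefix and prefix length only. */
struct fib_entry {
	uint32_t dst;
	int dst_len;
};

/* One listener per interested party, e.g. one per ASIC instance. */
struct fib_listener {
	void (*notify)(struct fib_listener *self, enum fib_event ev,
		       const struct fib_entry *entry);
	struct fib_listener *next;
	int n_entries;	/* entries this listener currently tracks */
};

static struct fib_listener *fib_chain;

static void fib_listener_register(struct fib_listener *l)
{
	assert(l->notify != NULL);
	l->next = fib_chain;
	fib_chain = l;
}

/* Called by the FIB code on every change, regardless of the netdev. */
static void fib_listeners_call(enum fib_event ev,
			       const struct fib_entry *entry)
{
	struct fib_listener *l;

	for (l = fib_chain; l; l = l->next)
		l->notify(l, ev, entry);
}

/* Example callback: just count the entries the listener has seen. */
static void count_notify(struct fib_listener *self, enum fib_event ev,
			 const struct fib_entry *entry)
{
	(void)entry;
	if (ev == FIB_ENTRY_ADD)
		self->n_entries++;
	else
		self->n_entries--;
}

/* Two listeners both see one default route and one host route. */
static int fib_notifier_demo(void)
{
	struct fib_listener a = { .notify = count_notify };
	struct fib_listener b = { .notify = count_notify };
	struct fib_entry def = { .dst = 0, .dst_len = 0 };
	struct fib_entry host = { .dst = 0xc0000201, .dst_len = 32 };
	int total;

	fib_chain = NULL;
	fib_listener_register(&a);
	fib_listener_register(&b);
	fib_listeners_call(FIB_ENTRY_ADD, &def);
	fib_listeners_call(FIB_ENTRY_ADD, &host);
	fib_listeners_call(FIB_ENTRY_DEL, &def);
	total = a.n_entries + b.n_entries;
	fib_chain = NULL;	/* listeners live on this stack frame */
	return total;
}
```

Each listener decides on its own what to do with an entry, which is what
makes the model usable by more than one offload device.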

The second patch introduces a couple of helpers to deal with the
RTNH_F_OFFLOAD flag. Currently the flag is set in switchdev core, under the
assumption that only one offload device exists. With the FIB notifier,
however, we assume multiple offload devices. So the patch introduces a
per-FIB-entry reference counter, and the helpers use it to achieve this:
    0 means RTNH_F_OFFLOAD is not set, no device offloads this entry
    n means RTNH_F_OFFLOAD is set and the entry is offloaded by n devices
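
The counting rule above can be sketched in plain C as follows. This is a
user-space model, not the helpers from patch 2: struct fib_info_model and
the offload_inc()/offload_dec() names are assumptions made for the example,
and only the RTNH_F_OFFLOAD value is taken from the rtnetlink UAPI header.

```c
#include <assert.h>

#define RTNH_F_OFFLOAD 8	/* as in include/uapi/linux/rtnetlink.h */

/* Hypothetical stand-in for the per-FIB-entry state. */
struct fib_info_model {
	unsigned int flags;
	unsigned int offload_cnt;	/* number of devices offloading */
};

/* A device started offloading the entry. */
static void offload_inc(struct fib_info_model *fi)
{
	if (fi->offload_cnt++ == 0)
		fi->flags |= RTNH_F_OFFLOAD;
}

/* A device stopped offloading the entry. */
static void offload_dec(struct fib_info_model *fi)
{
	assert(fi->offload_cnt > 0);
	if (--fi->offload_cnt == 0)
		fi->flags &= ~RTNH_F_OFFLOAD;
}
```

With two devices offloading the same entry, the flag stays set until the
second one calls offload_dec().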

Patches 3 and 4 convert mlxsw and rocker to adopt this new way, registering
one notifier block for each ASIC instance. Both of these patches also
implement an internal "abort" mechanism.

With switchdev ops, "abort" is called by switchdev core whenever there is
an error during FIB entry add offload. This leads to removal of all
offloaded entries on the system by the fib_trie code.

With the new notifier, the driver is expected to take care of the abort
action itself. Here's why:
1) The fact that one HW cannot offload an entry does not mean that the
   others can't do it. So let only that one entity abort and leave the
   rest to work happily.
2) The driver knows what to do in order to properly abort. For example,
   abort is currently broken for mlxsw, as Spectrum needs a 0.0.0.0/0
   trap to be set in the RALUE register.
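
A rough model of a driver-local abort, following the two points above. It is
a sketch under assumptions, not the mlxsw or rocker code: struct hw_router,
its fields and both functions are hypothetical, a full table stands in for
any offload failure, and the trap_all flag merely represents a catch-all
0.0.0.0/0 trap.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-ASIC routing state. */
struct hw_router {
	int capacity;	/* how many entries the HW table can hold */
	int n_entries;	/* entries currently offloaded */
	bool aborted;	/* this instance gave up on offloading */
	bool trap_all;	/* models a 0.0.0.0/0 trap to the CPU */
};

/* Abort only this instance: flush its entries, trap everything. */
static void hw_router_abort(struct hw_router *r)
{
	r->n_entries = 0;
	r->trap_all = true;
	r->aborted = true;
}

/* Handle a FIB entry addition for one instance. */
static void hw_router_fib_add(struct hw_router *r)
{
	assert(r->n_entries <= r->capacity);
	if (r->aborted)
		return;		/* the kernel forwards from now on */
	if (r->n_entries == r->capacity) {
		hw_router_abort(r);
		return;
	}
	r->n_entries++;
}
```

An instance that runs out of room aborts on its own, while another instance
that receives the same additions keeps its offloaded entries.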

The fifth patch removes the old, no longer used FIB offload infrastructure.

The last patch reflects the changes in the switchdev documentation file.

---
v1->v2:
-patch 3/6:
  -fixed lpm tree setup and binding for abort, as pointed out by Ido
  -do nexthop checks as suggested by Ido
  -fix use after free during abort
-patch 6/6:
  -fixed tests as suggested by Ido

Jiri Pirko (6):
  fib: introduce FIB notification infrastructure
  fib: introduce FIB info offload flag helpers
  mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls
  rocker: use FIB notifications instead of switchdev calls
  switchdev: remove FIB offload infrastructure
  doc: update switchdev L3 section

Documentation/networking/switchdev.txt | 27 +-
drivers/net/ethernet/mellanox/mlxsw/spectrum.h | 9 +-
.../net/ethernet/mellanox/mlxsw/spectrum_router.c | 428 ++++++++++++---------
.../ethernet/mellanox/mlxsw/spectrum_switchdev.c | 9 -
drivers/net/ethernet/rocker/rocker.h | 15 +-
drivers/net/ethernet/rocker/rocker_main.c | 120 ++++--
drivers/net/ethernet/rocker/rocker_ofdpa.c | 115 ++++--
include/net/ip_fib.h | 49 ++-
include/net/switchdev.h | 40 --
net/ipv4/fib_frontend.c | 29 +-
net/ipv4/fib_rules.c | 12 +-
net/ipv4/fib_trie.c | 166 +++-----
net/switchdev/switchdev.c | 181 ---------
13 files changed, 577 insertions(+), 623 deletions(-)
--
2.5.5