Date:   Tue,  6 Sep 2016 14:01:38 +0200
From:   Jiri Pirko <jiri@...nulli.us>
To:     netdev@...r.kernel.org
Cc:     davem@...emloft.net, idosch@...lanox.com, eladr@...lanox.com,
        yotamg@...lanox.com, nogahf@...lanox.com, ogerlitz@...lanox.com,
        roopa@...ulusnetworks.com, nikolay@...ulusnetworks.com,
        linville@...driver.com, tgraf@...g.ch, gospo@...ulusnetworks.com,
        sfeldma@...il.com, ast@...mgrid.com, edumazet@...gle.com,
        hannes@...essinduktion.org, f.fainelli@...il.com,
        dsa@...ulusnetworks.com, jhs@...atatu.com,
        vivien.didelot@...oirfairelinux.com, john.fastabend@...el.com,
        andrew@...n.ch, ivecera@...hat.com
Subject: [patch net-next RFC 0/2] fib4 offload: notifier to let hw to be aware of all prefixes

From: Jiri Pirko <jiri@...lanox.com>

This is an RFC and it is unfinished. I came across some issues in the process,
so I would like to share them and restart the fib offload discussion in order
to make it really usable.

The goal of this patchset is to allow a driver to propagate all prefixes
configured in the kernel down to HW. This is necessary for routing to work
as expected. If we don't do that, HW might incorrectly forward packets for
prefixes known to the kernel. Take as an example the case where a default
route is set in the switch HW and there is an IP address set on a management
(non-switch) port.

Currently, only fibs related to the switch port netdevs are offloaded using
switchdev ops. This model is not extensible, so the first patch introduces
a replacement: a notifier that propagates fib additions and removals to whoever
is interested. The second patch makes mlxsw adopt this new approach, registering
one notifier block for each mlxsw (ASIC) instance.
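
To illustrate the notifier model described above, here is a minimal userspace
sketch, not the code from the patches. All names in it (fib_event, fib_nb,
fib_nb_register, fib_notify, asic_fib_event) are hypothetical and chosen only
for this example. It shows one chain to which every interested instance
registers its own block, and every block seeing every addition and removal:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event types; the real ones are defined by the patch. */
enum fib_event { FIB_EVENT_ADD, FIB_EVENT_DEL };

/* Hypothetical, reduced description of a route. */
struct fib_info_sketch {
	uint32_t dst;     /* prefix, host byte order */
	uint8_t  dst_len; /* prefix length */
};

/* One block per interested party, e.g. one per ASIC instance. */
struct fib_nb {
	int (*call)(struct fib_nb *nb, enum fib_event ev,
		    const struct fib_info_sketch *fi);
	struct fib_nb *next;
	unsigned int seen; /* per-instance state: prefixes currently known */
};

static struct fib_nb *fib_chain;

void fib_nb_register(struct fib_nb *nb)
{
	nb->next = fib_chain;
	fib_chain = nb;
}

/* Deliver the event to every registered block. Return values are ignored
 * here because each listener is assumed to handle its own failures.
 */
void fib_notify(enum fib_event ev, const struct fib_info_sketch *fi)
{
	for (struct fib_nb *nb = fib_chain; nb; nb = nb->next)
		nb->call(nb, ev, fi);
}

/* Example listener: only counts the prefixes it has been told about. */
int asic_fib_event(struct fib_nb *nb, enum fib_event ev,
		   const struct fib_info_sketch *fi)
{
	(void)fi;
	if (ev == FIB_EVENT_ADD)
		nb->seen++;
	else if (nb->seen)
		nb->seen--;
	return 0;
}
```

In this sketch a route on a management port would reach every listener too,
since delivery does not depend on which netdev the route points to.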

Using switchdev ops, "abort" is called by switchdev core whenever there is
an error during fib add offload. This leads to removal of all offloaded fibs on
the system by the fib_trie code.

Now the new notifier assumes the driver takes care of the abort action.
Here's why:
1) The fact that one HW cannot offload a fib does not mean that the others
   can't do it. So let only that one entity abort and leave the rest to work
   happily.
2) The driver knows what to do in order to properly abort. For example,
   currently abort is broken for mlxsw, as for Spectrum there is a need to set
   a 0.0.0.0/0 trap in the RALUE register.
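
The two points above can be sketched as follows. This is a simplified model
under stated assumptions, not driver code: struct asic, asic_abort and
asic_route_add are hypothetical names, and the trap_all flag merely stands in
for whatever a real device needs on abort (for Spectrum, the 0.0.0.0/0 trap
mentioned above).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state. */
struct asic {
	size_t used;     /* routes currently offloaded */
	size_t capacity; /* size of the HW table */
	bool aborted;
	bool trap_all;   /* stands in for a catch-all trap-to-CPU entry */
};

/* Device-specific abort: flush own entries and send everything to the CPU. */
void asic_abort(struct asic *a)
{
	a->used = 0;
	a->trap_all = true;
	a->aborted = true;
}

/* Called for each route addition; a failure affects only this device. */
void asic_route_add(struct asic *a)
{
	if (a->aborted)
		return;
	if (a->used == a->capacity) {
		asic_abort(a);
		return;
	}
	a->used++;
}
```

With two instances of different capacity, the smaller one aborts once its
table is full while the larger one keeps its offloaded routes.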

Issues:
1) RTNH_F_OFFLOAD is originally set in switchdev core. There the assumption is
   that only one offload device exists. But for the fib notifier, we assume
   multiple offload devices. When should the offload flag be set and by whom?
   I think that it would make sense to have a per-fib reference counter
   for this:
   0 means RTNH_F_OFFLOAD is not set, no device offloads this entry
   n means RTNH_F_OFFLOAD is set and the fib entry is offloaded by n devices

2) Unabort? Would be nice. Currently, when add_failure->abort happens, the
   user's only option is to reboot the machine. I would like to make this
   nicer for the fib notifier implementation. Perhaps provide some button in
   devlink which would tell the driver to try to offload the entries again?
   Not sure.

3) Policies. Not directly connected to this patchset, but this is an issue
   we have discussed a couple of times and I still believe that
   the current state is not good.
   Software-only forwarding now happens in case of abort and makes the ASIC
   ports act like dummy separate NICs. In case of Spectrum, the bandwidth
   of the CPU port is something around 4Gbit. For 32x100Gbit ports this is
   simply not possible to handle. In case of abort, the system is broken,
   as it cannot forward packets at a speed even close to the expected one.
   Here policies come into the picture, allowing the user to set the
   system to behave according to their expectations. For example, rather
   fail to add the route than abort to software forwarding.
   This policy could be per-ASIC, configurable by devlink.
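
Issues 1) and 3) can be sketched together. The following is only a userspace
model under assumptions: the offload_cnt field, the offload_policy enum and all
function names are hypothetical, and RTNH_F_OFFLOAD is redefined locally (with
the value used in linux/rtnetlink.h) so the example stands alone. A kernel
implementation would also need proper locking around the counter.

```c
#include <errno.h>
#include <stdbool.h>

#define RTNH_F_OFFLOAD 8 /* redefined here so the sketch is self-contained */
#define ASIC_TABLE_MAX 4 /* tiny HW table so the failure path is easy to hit */

/* Hypothetical per-ASIC policy, e.g. something settable through devlink. */
enum offload_policy { POLICY_ABORT_TO_SW, POLICY_FAIL_ADD };

struct fib_entry_sketch {
	unsigned int flags;
	unsigned int offload_cnt; /* number of devices offloading this entry */
};

struct asic_sketch {
	enum offload_policy policy;
	struct fib_entry_sketch *table[ASIC_TABLE_MAX];
	unsigned int used;
	bool aborted;
};

/* 0 -> 1 sets the flag, 1 -> 0 clears it. */
void fib_offload_inc(struct fib_entry_sketch *fe)
{
	if (fe->offload_cnt++ == 0)
		fe->flags |= RTNH_F_OFFLOAD;
}

void fib_offload_dec(struct fib_entry_sketch *fe)
{
	if (fe->offload_cnt == 0)
		return; /* unbalanced decrement, ignored in this sketch */
	if (--fe->offload_cnt == 0)
		fe->flags &= ~RTNH_F_OFFLOAD;
}

/* Drop this device's references and fall back to software forwarding. */
void asic_abort(struct asic_sketch *a)
{
	while (a->used)
		fib_offload_dec(a->table[--a->used]);
	a->aborted = true;
}

/* Returns 0 if the route is accepted, -ENOSPC if the policy rejects it. */
int asic_route_add(struct asic_sketch *a, struct fib_entry_sketch *fe)
{
	if (a->aborted)
		return 0; /* this device already forwards in software */
	if (a->used < ASIC_TABLE_MAX) {
		a->table[a->used++] = fe;
		fib_offload_inc(fe);
		return 0;
	}
	if (a->policy == POLICY_FAIL_ADD)
		return -ENOSPC; /* refuse the route, keep HW state intact */
	asic_abort(a);
	return 0;
}
```

With one device per policy and a full table, the POLICY_FAIL_ADD device
rejects the extra route and stays offloaded, while the POLICY_ABORT_TO_SW
device aborts and the counter on the shared entries drops from 2 to 1, so
RTNH_F_OFFLOAD stays set for as long as at least one device offloads them.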

Thoughts please?

Jiri Pirko (2):
  fib: introduce fib notification infrastructure
  mlxsw: spectrum_router: Use FIB notifications instead of switchdev
    calls

 drivers/net/ethernet/mellanox/mlxsw/spectrum.h     |   8 +-
 .../net/ethernet/mellanox/mlxsw/spectrum_router.c  | 257 ++++++++++-----------
 .../ethernet/mellanox/mlxsw/spectrum_switchdev.c   |   9 -
 include/net/ip_fib.h                               |  19 ++
 net/ipv4/fib_trie.c                                |  43 ++++
 5 files changed, 181 insertions(+), 155 deletions(-)

-- 
2.5.5
