[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20200513164140.7956-1-pablo@netfilter.org>
Date: Wed, 13 May 2020 18:41:32 +0200
From: Pablo Neira Ayuso <pablo@...filter.org>
To: netfilter-devel@...r.kernel.org
Cc: davem@...emloft.net, netdev@...r.kernel.org, paulb@...lanox.com,
ozsh@...lanox.com, vladbu@...lanox.com, jiri@...nulli.us,
kuba@...nel.org, saeedm@...lanox.com, michael.chan@...adcom.com
Subject: [PATCH 0/8 net] the indirect flow_block offload, revisited
Hi,
This patchset fixes the indirect flow_block support for the tc CT action
offload. Please, note that this batch is probably slightly large for the
net tree, however, I could not find a simple incremental fix.
= The problem
The nf_flow_table_indr_block_cb() function provides the tunnel netdevice
and the indirect flow_block driver callback. From this tunnel netdevice,
it is not possible to obtain the tc CT flow_block. Note that tc qdisc
and netfilter backtrack from the tunnel netdevice to the tc block /
netfilter chain to reach the flow_block object. This allows them to
clean up the hardware offload rules if the tunnel device is removed.
= What is the indirect flow_block infrastructure?
The indirect flow_block infrastructure allows drivers to offload
tc/netfilter rules that belong to software tunnel netdevices, e.g.
vxlan.
This indirect flow_block infrastructure relates tunnel netdevices with
drivers because there is no obvious way to relate these two things
from the control plane.
= How does the indirect flow_block work before this patchset?
Front-ends register the indirect flow_block callback through
flow_indr_add_block_cb() if they support for offloading tunnel
netdevices.
== Setting up an indirect flow_block
1) Drivers track tunnel netdevices via NETDEV_{REGISTER,UNREGISTER} events.
If there is a new tunnel netdevice that the driver can offload, then the
driver invokes __flow_indr_block_cb_register() with the new tunnel
netdevice and the driver callback. The __flow_indr_block_cb_register()
call iterates over the list of the front-end callbacks.
2) The front-end callback sets up the flow_block_offload structure and it
invokes the driver callback to set up the flow_block.
3) The driver callback now registers the flow_block structure and it
returns the flow_block back to the front-end.
4) The front-end gets the flow_block object and it is now ready to
offload rules for this tunnel netdevice.
A simplified callgraph is represented below.
Front-end Driver
NETDEV_REGISTER
|
__flow_indr_block_cb_register(netdev, cb_priv, driver_cb)
| [1]
.---------- frontend_indr_block_cb(cb_priv, driver_cb)
|
setup_flow_block_offload(bo)
| [2]
driver_cb(bo, cb_priv) ---------------.
|
set up flow_blocks [3]
|
add rules to flow_block <-------------'
TC_SETUP_CLSFLOWER [4]
== Releasing the indirect flow_block
There are two possibilities, either tunnel netdevice is removed or
a netdevice (port representor) is removed.
=== Tunnel netdevice is removed
Driver waits for the NETDEV_UNREGISTER event that announces the tunnel
netdevice removal. Then, it calls __flow_indr_block_cb_unregister() to
remove the flow_block and rules. Callgraph is very similar to the one
described above.
=== Netdevice is removed (port representor)
Driver calls __flow_indr_block_cb_unregister() to remove the existing
netfilter/tc rule that belong to the tunnel netdevice.
= How does the indirect flow_block work after this patchset?
Drivers register the indirect flow_block setup callback through
flow_indr_dev_register() if they support for offloading tunnel
netdevices.
== Setting up an indirect flow_block
1) Frontends check if dev->netdev_ops->ndo_setup_tc is unset. If so,
frontends call flow_indr_dev_setup_offload(). This call invokes
the drivers' indirect flow_block setup callback.
2) The indirect flow_block setup callback sets up a flow_block structure
which relates the tunnel netdevice and the driver.
3) The front-end uses flow_block and offload the rules.
Note that the operational to set up (non-indirect) flow_block is very
similar.
== Releasing the indirect flow_block
=== Tunnel netdevice is removed
This calls flow_indr_dev_setup_offload() to set down the flow_block and
remove the offloaded rules. This alternate path is exercised if
dev->netdev_ops->ndo_setup_tc is unset.
=== Netdevice is removed (port representor)
If a netdevice is removed, then it might need to to clean up the
offloaded tc/netfilter rules that belongs to the tunnel netdevice:
1) The driver invokes flow_indr_dev_unregister() when a netdevice is
removed.
2) This call iterates over the existing indirect flow_blocks
and it invokes the cleanup callback to let the front-end remove the
tc/netfilter rules. The cleanup callback already provides the
flow_block that the front-end needs to clean up.
Front-end Driver
|
flow_indr_dev_unregister(...)
|
iterate over list of indirect flow_block
and invoke cleanup callback
|
.-----------------------------
|
.
frontend_flow_block_cleanup(flow_block)
.
|
\/
remove rules to flow_block
TC_SETUP_CLSFLOWER
= About this patchset
This patchset aims to address the existing TC CT problem while
simplifying the indirect flow_block infrastructure. Saving 300 LoC in
the flow_offload core and the drivers. The operational gets aligned with
the (non-indirect) flow_blocks logic. Patchset is composed of:
Patch #1 add nf_flow_table_gc_cleanup() which is required by the
netfilter's flowtable new indirect flow_block approach.
Patch #2 adds the flow_block_indr object which is actually part of
of the flow_block object. This stores the indirect flow_block
metadata such as the tunnel netdevice owner and the cleanup
callback (in case the tunnel netdevice goes away).
This patch adds flow_indr_dev_{un}register() to allow drivers
to offer netdevice tunnel hardware offload to the front-ends.
Then, front-ends call flow_indr_dev_setup_offload() to invoke
the drivers to set up the (indirect) flow_block.
Patch #3 add the tcf_block_offload_init() helper function, this is
a preparation patch to adapt the tc front-end to use this
new indirect flow_block infrastructure.
Patch #4 updates the tc and netfilter front-ends to use the new
indirect flow_block infrastructure.
Patch #5 updates the mlx5 driver to use the new indirect flow_block
infrastructure.
Patch #6 updates the nfp driver to use the new indirect flow_block
infrastructure.
Patch #7 updates the bnxt driver to use the new indirect flow_block
infrastructure.
Patch #8 removes the indirect flow_block infrastructure version 1,
now that frontends and drivers have been translated to
version 2 (coming in this patchset).
Please, apply.
Pablo Neira Ayuso (8):
netfilter: nf_flowtable: expose nf_flow_table_gc_cleanup()
net: flow_offload: consolidate indirect flow_block infrastructure
net: cls_api: add tcf_block_offload_init()
net: use flow_indr_dev_setup_offload()
mlx5: update indirect block support
nfp: update indirect block support
bnxt_tc: update indirect block support
net: remove indirect block netdev event registration
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 -
drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c | 51 +--
.../net/ethernet/mellanox/mlx5/core/en_rep.c | 83 +----
.../net/ethernet/mellanox/mlx5/core/en_rep.h | 5 -
.../net/ethernet/netronome/nfp/flower/main.c | 11 +-
.../net/ethernet/netronome/nfp/flower/main.h | 7 +-
.../ethernet/netronome/nfp/flower/offload.c | 35 +-
include/net/flow_offload.h | 28 +-
include/net/netfilter/nf_flow_table.h | 2 +
net/core/flow_offload.c | 301 +++++++-----------
net/netfilter/nf_flow_table_core.c | 6 +-
net/netfilter/nf_flow_table_offload.c | 85 +----
net/netfilter/nf_tables_offload.c | 69 ++--
net/sched/cls_api.c | 157 +++------
14 files changed, 251 insertions(+), 590 deletions(-)
--
2.20.1
Powered by blists - more mailing lists