[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20211025222415.983883-1-vladimir.oltean@nxp.com>
Date: Tue, 26 Oct 2021 01:24:00 +0300
From: Vladimir Oltean <vladimir.oltean@....com>
To: netdev@...r.kernel.org, Roopa Prabhu <roopa@...dia.com>,
Nikolay Aleksandrov <nikolay@...dia.com>,
Ido Schimmel <idosch@...dia.com>
Cc: Jakub Kicinski <kuba@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Andrew Lunn <andrew@...n.ch>,
Florian Fainelli <f.fainelli@...il.com>,
Vivien Didelot <vivien.didelot@...il.com>,
Vladimir Oltean <olteanv@...il.com>,
Jiri Pirko <jiri@...dia.com>
Subject: [RFC PATCH net-next 00/15] Synchronous feedback on FDB add/del from switchdev to the bridge
Hello, this is me bombarding the list with switchdev FDB changes again.
This series attempts to address one design limitation in the interaction
between the bridge and switchdev: error codes returned from the
SWITCHDEV_FDB_ADD_TO_DEVICE and SWITCHDEV_FDB_DEL_TO_DEVICE handlers are
completely ignored.
There are multiple aspects to that. First of all, drivers have a portion
that handles those switchdev events in atomic context, and a portion
that handles them in a private deferred work context. Errors reported
from both calling contexts are ignored by the bridge, and it is
desirable to actually propagate both to user space.
Secondly, it is in fact okay that some switchdev errors are ignored.
The call graph for fdb_notify() is not simple, it looks something like
this (not complete):
IFLA_BRPORT_FLUSH RTM_NEWNEIGH
| |
| {br,nbp}_vlan_delete br_fdb_change_mac_address v
| | | | fast __br_fdb_add
| | | del_nbp, br_dev_delete br_fdb_changeaddr | path / | \
| | | | | | learning / | \
\ | -------------------- br_fdb_find_delete_local | | | / | \ switchdev event
\ | | | | | | / | \ listener
-------------------------- br_fdb_delete_by_port | | | | / | \ |
| | | | | | / | \ |
| | | | | | / | \ |
| | | | | br_fdb_update | br_fdb_external_learn_add
(RTM_DELNEIGH) br_fdb_delete | | | | | | | |
| | | | | | | | | gc_work netdevice
| | | | | | | fdb_add_entry | timer event
| | fdb_delete_local | | | | listener
__br_fdb_delete | | | | / br_fdb_cleanup |
| | | | | / | | br_stp_change_bridge_id
| | | \ | / | br_fdb_changeaddr |
| | | \ | / | | |
fdb_delete_by_addr_and_port | | fdb_insert \ | / ----/ | br_fdb_change_mac_address
| | | | \ | / / | |
br_fdb_external_learn_del | | | | br_fdb_cleanup \ | / / | | br_fdb_insert
| | | | | | \ | / ----/ | | |
| | | | | | \ | / / fdb_insert
br_fdb_flush | | | | | | \ | / / --------/
\---- | | | | | | \ | / / ------/
\----------- fdb_delete --------------- fdb_notify ---------/
There's not a lot that the fast path learning can do about switchdev
when that returns an error.
So this patch set mainly wants to deal with the 2 code paths that are
triggered by these regular commands:
bridge fdb add dev swp0 00:01:02:03:04:05 master static # __br_fdb_add
bridge fdb del dev swp0 00:01:02:03:04:05 master static # __br_fdb_delete
In some other, semi-related discussions, Ido Schimmel pointed out that
it would be nice if user space got some feedback from the actual driver,
and made some proposals about how that could be done.
https://patchwork.kernel.org/project/netdevbpf/cover/20210819160723.2186424-1-vladimir.oltean@nxp.com/
One of the proposals was to call fdb_notify() from sleepable context,
but Nikolay disliked the idea of introducing deferred work in the bridge
driver (seems like nobody wants to deal with it).
And since all proposals of dealing with the deferred work inside
switchdev were also shot down for valid reasons, we are basically left
as a baseline with the code that we have today, with the deferred work
being private to the driver, and somehow we must propagate an err and an
extack from there.
So the approach taken here is to reorganize the code a bit and add some
hooks in:
(a) some callers of the fdb_notify() function to initialize a completion
structure
(b) some drivers that catch SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE and mark
that completion structure as done
(c) some bridge logic that I believe is fairly safe (I'm open to being
proven wrong) that temporarily drops the &br->hash_lock in order to
sleep until the completion is done.
There are some further optimizations that can be made. For example, we
can avoid dropping the hash_lock if there is no switchdev response pending.
And we can move some of that completion logic in br_switchdev.c such
that it is compiled out on a CONFIG_NET_SWITCHDEV=n build. I haven't
done those here, since they aren't exactly trivial. Mainly searching for
high-level feedback first and foremost.
The structure of the patch series is:
- patches 1-6 are me toying around with some code organization while I
was trying to understand the various call paths better. I like not
having forward declarations, but if they exist for a reason, I can
drop these patches.
- patches 7-10 and 12 are some preparation work that can also be ignored.
- patches 11 and 13 are where the meat of the series is.
- patches 14 and 15 are DSA boilerplate so I could test what I'm doing.
Vladimir Oltean (15):
net: bridge: remove fdb_notify forward declaration
net: bridge: remove fdb_insert forward declaration
net: bridge: rename fdb_insert to fdb_add_local
net: bridge: rename br_fdb_insert to br_fdb_add_local
net: bridge: move br_fdb_replay inside br_switchdev.c
net: bridge: create a common function for populating switchdev FDB
entries
net: switchdev: keep the MAC address by value in struct
switchdev_notifier_fdb_info
net: bridge: take the hash_lock inside fdb_add_entry
net: bridge: rename fdb_notify to br_fdb_notify
net: switchdev: merge switchdev_handle_fdb_{add,del}_to_device
net: bridge: make fdb_add_entry() wait for switchdev feedback
net: rtnetlink: pass extack to .ndo_fdb_del
net: bridge: wait for errors from switchdev when deleting FDB entries
net: dsa: propagate feedback to SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE
net: dsa: propagate extack to .port_fdb_{add,del}
drivers/net/dsa/b53/b53_common.c | 6 +-
drivers/net/dsa/b53/b53_priv.h | 6 +-
drivers/net/dsa/hirschmann/hellcreek.c | 6 +-
drivers/net/dsa/lan9303-core.c | 7 +-
drivers/net/dsa/lantiq_gswip.c | 6 +-
drivers/net/dsa/microchip/ksz9477.c | 6 +-
drivers/net/dsa/mt7530.c | 6 +-
drivers/net/dsa/mv88e6xxx/chip.c | 6 +-
drivers/net/dsa/ocelot/felix.c | 6 +-
drivers/net/dsa/qca8k.c | 6 +-
drivers/net/dsa/sja1105/sja1105_main.c | 12 +-
.../ethernet/freescale/dpaa2/dpaa2-switch.c | 14 +-
drivers/net/ethernet/intel/ice/ice_main.c | 4 +-
.../marvell/prestera/prestera_switchdev.c | 18 +-
.../mellanox/mlx5/core/en/rep/bridge.c | 10 -
.../ethernet/mellanox/mlx5/core/esw/bridge.c | 2 +-
.../ethernet/mellanox/mlxsw/spectrum_router.c | 4 +-
.../mellanox/mlxsw/spectrum_switchdev.c | 11 +-
.../microchip/sparx5/sparx5_mactable.c | 2 +-
.../microchip/sparx5/sparx5_switchdev.c | 12 +-
drivers/net/ethernet/mscc/ocelot_net.c | 3 +-
.../net/ethernet/qlogic/qlcnic/qlcnic_main.c | 5 +-
drivers/net/ethernet/rocker/rocker_main.c | 13 +-
drivers/net/ethernet/rocker/rocker_ofdpa.c | 2 +-
drivers/net/ethernet/ti/am65-cpsw-switchdev.c | 14 +-
drivers/net/ethernet/ti/cpsw_switchdev.c | 14 +-
drivers/net/macvlan.c | 3 +-
drivers/net/vxlan.c | 3 +-
drivers/s390/net/qeth_l2_main.c | 11 +-
include/linux/netdevice.h | 6 +-
include/net/dsa.h | 6 +-
include/net/switchdev.h | 67 +-
net/bridge/br_fdb.c | 657 ++++++++++--------
net/bridge/br_if.c | 2 +-
net/bridge/br_private.h | 25 +-
net/bridge/br_switchdev.c | 103 ++-
net/bridge/br_vlan.c | 5 +-
net/core/rtnetlink.c | 4 +-
net/dsa/dsa_priv.h | 15 +-
net/dsa/port.c | 13 +-
net/dsa/slave.c | 87 +--
net/dsa/switch.c | 22 +-
net/switchdev/switchdev.c | 156 +----
43 files changed, 678 insertions(+), 708 deletions(-)
--
2.25.1
Powered by blists - more mailing lists