[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20210728182748.3564726-2-vladimir.oltean@nxp.com>
Date: Wed, 28 Jul 2021 21:27:47 +0300
From: Vladimir Oltean <vladimir.oltean@....com>
To: netdev@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>,
"David S. Miller" <davem@...emloft.net>
Cc: Florian Fainelli <f.fainelli@...il.com>,
Andrew Lunn <andrew@...n.ch>,
Vivien Didelot <vivien.didelot@...il.com>,
Ido Schimmel <idosch@...dia.com>, Jiri Pirko <jiri@...dia.com>,
Roopa Prabhu <roopa@...dia.com>,
Nikolay Aleksandrov <nikolay@...dia.com>,
Tobias Waldekranz <tobias@...dekranz.com>
Subject: [PATCH net-next 1/2] net: bridge: switchdev: replay the entire FDB for each port
Currently when a switchdev port joins a bridge, we replay all FDB
entries pointing towards that port or towards the bridge.
However, this is insufficient in certain situations:
(a) DSA, through its assisted_learning_on_cpu_port logic, snoops
dynamically learned FDB entries on foreign interfaces.
These are FDB entries that are pointing neither towards the newly
joined switchdev port, nor towards the bridge. So these addresses
would be missed when joining a bridge where a foreign interface has
already learned some addresses, and they would also linger on if the
DSA port leaves the bridge before the foreign interface forgets them.
None of this happens if we replay the entire FDB when the port joins.
(b) There is a desire to treat local FDB entries on a port (i.e. the
port's termination MAC address) identically to FDB entries pointing
towards the bridge itself. More details on the reason behind this in
the next patch. The point is that this cannot be done given the
current structure of br_fdb_replay() in this situation:
ip link set swp0 master br0 # br0 inherits its MAC address from swp0
ip link set swp1 master br0
What is desirable is that when swp1 joins the bridge, br_fdb_replay()
also notifies swp1 of br0's MAC address, but this won't in fact
happen because the MAC address of br0 does not have fdb->dst == NULL
(it doesn't point towards the bridge), but it has fdb->dst == swp0.
So our current logic makes it impossible for that address to be
replayed. But if we dump the entire FDB instead of just the entries
with fdb->dst == swp1 and fdb->dst == NULL, then the inherited MAC
address of br0 will be replayed too, which is what we need.
A natural question arises: say there is an FDB entry to be replayed,
like a MAC address dynamically learned on a foreign interface that
belongs to a bridge where no switchdev port has joined yet. If 10
switchdev ports belonging to the same driver join this bridge, one by
one, won't every port get notified 10 times of the foreign FDB entry,
amounting to a total of 100 notifications for this FDB entry in the
switchdev driver?
Well, yes, but this is where the "void *ctx" argument for br_fdb_replay
is useful: every port of the switchdev driver is notified whenever any
other port requests an FDB replay, but because the replay was initiated
by a different port, its context is different from the initiating port's
context, so it ignores those replays.
So the foreign FDB entry will be installed only 10 times, once per port.
This is done so that the following 4 code paths are always well balanced:
(a) addition of foreign FDB entry is replayed when port joins bridge
(b) deletion of foreign FDB entry is replayed when port leaves bridge
(c) addition of foreign FDB entry is notified to all ports currently in bridge
(c) deletion of foreign FDB entry is notified to all ports currently in bridge
Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
---
net/bridge/br_fdb.c | 23 +++++++----------------
net/bridge/br_private.h | 4 ++--
net/bridge/br_switchdev.c | 14 ++------------
3 files changed, 11 insertions(+), 30 deletions(-)
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 5b345bb72078..be75889ceeba 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -732,11 +732,12 @@ static inline size_t fdb_nlmsg_size(void)
+ nla_total_size(sizeof(u8)); /* NFEA_ACTIVITY_NOTIFY */
}
-static int br_fdb_replay_one(struct notifier_block *nb,
+static int br_fdb_replay_one(struct net_bridge *br, struct notifier_block *nb,
const struct net_bridge_fdb_entry *fdb,
- struct net_device *dev, unsigned long action,
- const void *ctx)
+ unsigned long action, const void *ctx)
{
+ const struct net_bridge_port *p = READ_ONCE(fdb->dst);
+ struct net_device *dev = p ? p->dev : br->dev;
struct switchdev_notifier_fdb_info item;
int err;
@@ -752,8 +753,8 @@ static int br_fdb_replay_one(struct notifier_block *nb,
return notifier_to_errno(err);
}
-int br_fdb_replay(const struct net_device *br_dev, const struct net_device *dev,
- const void *ctx, bool adding, struct notifier_block *nb)
+int br_fdb_replay(const struct net_device *br_dev, const void *ctx, bool adding,
+ struct notifier_block *nb)
{
struct net_bridge_fdb_entry *fdb;
struct net_bridge *br;
@@ -766,9 +767,6 @@ int br_fdb_replay(const struct net_device *br_dev, const struct net_device *dev,
if (!netif_is_bridge_master(br_dev))
return -EINVAL;
- if (!netif_is_bridge_port(dev) && !netif_is_bridge_master(dev))
- return -EINVAL;
-
br = netdev_priv(br_dev);
if (adding)
@@ -779,14 +777,7 @@ int br_fdb_replay(const struct net_device *br_dev, const struct net_device *dev,
rcu_read_lock();
hlist_for_each_entry_rcu(fdb, &br->fdb_list, fdb_node) {
- const struct net_bridge_port *dst = READ_ONCE(fdb->dst);
- struct net_device *dst_dev;
-
- dst_dev = dst ? dst->dev : br->dev;
- if (dst_dev && dst_dev != dev)
- continue;
-
- err = br_fdb_replay_one(nb, fdb, dst_dev, action, ctx);
+ err = br_fdb_replay_one(br, nb, fdb, action, ctx);
if (err)
break;
}
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index f2d34ea1ea37..c939631428b9 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -777,8 +777,8 @@ int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
bool swdev_notify);
void br_fdb_offloaded_set(struct net_bridge *br, struct net_bridge_port *p,
const unsigned char *addr, u16 vid, bool offloaded);
-int br_fdb_replay(const struct net_device *br_dev, const struct net_device *dev,
- const void *ctx, bool adding, struct notifier_block *nb);
+int br_fdb_replay(const struct net_device *br_dev, const void *ctx, bool adding,
+ struct notifier_block *nb);
/* br_forward.c */
enum br_pkt_type {
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index 9cf9ab320c48..8bc3c7fc415f 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -287,13 +287,7 @@ static int nbp_switchdev_sync_objs(struct net_bridge_port *p, const void *ctx,
if (err && err != -EOPNOTSUPP)
return err;
- /* Forwarding and termination FDB entries on the port */
- err = br_fdb_replay(br_dev, dev, ctx, true, atomic_nb);
- if (err && err != -EOPNOTSUPP)
- return err;
-
- /* Termination FDB entries on the bridge itself */
- err = br_fdb_replay(br_dev, br_dev, ctx, true, atomic_nb);
+ err = br_fdb_replay(br_dev, ctx, true, atomic_nb);
if (err && err != -EOPNOTSUPP)
return err;
@@ -312,11 +306,7 @@ static void nbp_switchdev_unsync_objs(struct net_bridge_port *p,
br_mdb_replay(br_dev, dev, ctx, false, blocking_nb, NULL);
- /* Forwarding and termination FDB entries on the port */
- br_fdb_replay(br_dev, dev, ctx, false, atomic_nb);
-
- /* Termination FDB entries on the bridge itself */
- br_fdb_replay(br_dev, br_dev, ctx, false, atomic_nb);
+ br_fdb_replay(br_dev, ctx, false, atomic_nb);
}
/* Let the bridge know that this port is offloaded, so that it can assign a
--
2.25.1
Powered by blists - more mailing lists