Message-Id: <20210804135436.1741856-8-vladimir.oltean@nxp.com>
Date: Wed, 4 Aug 2021 16:54:35 +0300
From: Vladimir Oltean <vladimir.oltean@....com>
To: netdev@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>,
"David S. Miller" <davem@...emloft.net>
Cc: Andrew Lunn <andrew@...n.ch>,
Florian Fainelli <f.fainelli@...il.com>,
Vivien Didelot <vivien.didelot@...il.com>,
Vladimir Oltean <olteanv@...il.com>
Subject: [PATCH v3 net-next 7/8] net: dsa: sja1105: suppress TX packets from looping back in "H" topologies
H topologies like this one have a problem:
         eth0                                                     eth1
          |                                                        |
       CPU port                                                CPU port
          |                        DSA link                        |
 sw0p0  sw0p1  sw0p2  sw0p3  sw0p4 -------- sw1p4  sw1p3  sw1p2  sw1p1  sw1p0
   |             |      |                            |      |             |
 user          user   user                         user   user          user
 port          port   port                         port   port          port
Any packet sent by the eth0 DSA master can be flooded on the
interconnecting DSA link sw0p4 <-> sw1p4, and it will then be received
by the eth1 DSA master too. In effect, we are talking to ourselves.
In VLAN-unaware mode, these packets are encoded using a tag_8021q TX
VLAN, which dsa_8021q_rcv() rightfully cannot decode, so it complains.
In VLAN-aware mode, however, the packets are encoded with a bridge VLAN
which _can_ be decoded by the tagger running on eth1, so it will attempt
to reinject that packet into the network stack (the bridge, if there is
any port under eth1 that is under a bridge). In the case where the ports
under eth1 are under the same cross-chip bridge as the ports under eth0,
the TX packets will even be learned as RX packets. The only thing that
prevents loops through the software bridging path, and therefore
disaster, is that the source port and the destination port are in the
same hardware domain: the bridge receives packets from the driver with
skb->offload_fwd_mark = true and does not forward between the two.
The proper solution to this problem is to detect H topologies and
enforce that each CPU port receives packets only through its local
switch, never from switches that have a CPU port of their own. This is
a viable solution which works thanks to the fact that MAC addresses
which should be filtered towards the host are installed by DSA as
static MAC addresses towards the CPU port of each switch.
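
As an illustration of the detection rule described above, here is a
minimal userspace sketch. It is not kernel code: struct model_port,
struct model_link and both helper functions are hypothetical stand-ins
invented for this example, and the integer "cpu" field merely plays the
role of the cpu_dp pointer comparison done in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a DSA port (not the kernel struct). */
struct model_port {
	int sw;		/* index of the switch this port belongs to */
	int index;	/* port number within that switch */
	int cpu;	/* id of the CPU port (DSA master) serving this port */
};

/* Hypothetical stand-in for one routing table entry (one link direction). */
struct model_link {
	const struct model_port *dp;		/* local end of the DSA link */
	const struct model_port *link_dp;	/* remote end of the DSA link */
};

/*
 * RX from a DSA link towards our CPU port is cut only if the link starts
 * on the local switch and the remote end is served by a different CPU port.
 */
static bool model_link_needs_rx_cut(const struct model_link *dl, int local_sw)
{
	return dl->dp->sw == local_sw && dl->link_dp->cpu != dl->dp->cpu;
}

/* Walk the whole table, as the patch does, and count the links to cut. */
static size_t model_count_rx_cuts(const struct model_link *rtable, size_t n,
				  int local_sw)
{
	size_t i, cuts = 0;

	for (i = 0; i < n; i++)
		if (model_link_needs_rx_cut(&rtable[i], local_sw))
			cuts++;

	return cuts;
}
```

In a plain daisy chain, where both ends of the link share one CPU port,
the predicate is false and nothing is cut. In the H topology it is true
once per switch, for the link direction that starts on that switch.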
TX from a CPU port towards the DSA port continues to be allowed. This is
because sja1105 supports bridge TX forwarding offload, and the skb->dev
used initially for xmit does not have any direct correlation with where
the station that will respond to that packet is connected. It may very
well happen that when we send a ping through a br0 interface that spans
all switch ports, the xmit packet will exit the system through a DSA
switch interface under eth1 (say sw1p2), but the destination station is
connected to a switch port under eth0, like sw0p0. So the switch under
eth1 needs to communicate on TX with the switch under eth0. The
response, however, will not follow the same path. Instead, this patch
enforces that the response is sent by the first switch directly to its
DSA master, which is eth0.
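
The asymmetry between the two directions can be sketched with
per-ingress-port bitmasks. This is only a model under stated
assumptions: struct model_l2fwd, MODEL_BIT() and the two helpers are
hypothetical names made up for this example, and the reach_port field
is an assumed stand-in for whatever sja1105_port_allow_traffic()
updates in the real driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_BIT(n)	(UINT32_C(1) << (n))

/*
 * Hypothetical per-ingress-port forwarding entry. Each mask lists the
 * egress ports that traffic received on this port may be sent to.
 */
struct model_l2fwd {
	uint32_t reach_port;	/* assumed name: known unicast destinations */
	uint32_t bc_domain;	/* broadcast destinations */
	uint32_t fl_domain;	/* flooding destinations */
};

/* Cut only the from -> to direction; the entry for "to" is left alone. */
static void model_cut_rx(struct model_l2fwd *l2fwd, int from, int to)
{
	l2fwd[from].reach_port &= ~MODEL_BIT(to);
	l2fwd[from].bc_domain &= ~MODEL_BIT(to);
	l2fwd[from].fl_domain &= ~MODEL_BIT(to);
}

/* True if traffic that ingresses on "from" may still egress on "to". */
static bool model_can_forward(const struct model_l2fwd *l2fwd, int from,
			      int to)
{
	return (l2fwd[from].reach_port & MODEL_BIT(to)) != 0;
}
```

Calling model_cut_rx(l2fwd, 4, 1) on a table whose masks all start
fully set (port 4 as the DSA link and port 1 as the CPU port, both
numbers chosen purely for illustration) leaves model_can_forward(l2fwd,
1, 4) true while model_can_forward(l2fwd, 4, 1) becomes false. That is
the "TX allowed, RX cut" behaviour the paragraph above describes.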
Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
---
drivers/net/dsa/sja1105/sja1105_main.c | 29 ++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index fffcaef6b148..b3b5ae3ef408 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -474,7 +474,9 @@ static int sja1105_init_l2_forwarding(struct sja1105_private *priv)
 {
 	struct sja1105_l2_forwarding_entry *l2fwd;
 	struct dsa_switch *ds = priv->ds;
+	struct dsa_switch_tree *dst;
 	struct sja1105_table *table;
+	struct dsa_link *dl;
 	int port, tc;
 	int from, to;
@@ -547,6 +549,33 @@ static int sja1105_init_l2_forwarding(struct sja1105_private *priv)
}
}
+	/* In odd topologies ("H" connections where there is a DSA link to
+	 * another switch which also has its own CPU port), TX packets can loop
+	 * back into the system (they are flooded from CPU port 1 to the DSA
+	 * link, and from there to CPU port 2). Prevent this from happening by
+	 * cutting RX from DSA links towards our CPU port, if the remote switch
+	 * has its own CPU port and therefore doesn't need ours for network
+	 * stack termination.
+	 */
+	dst = ds->dst;
+
+	list_for_each_entry(dl, &dst->rtable, list) {
+		if (dl->dp->ds != ds || dl->link_dp->cpu_dp == dl->dp->cpu_dp)
+			continue;
+
+		from = dl->dp->index;
+		to = dsa_upstream_port(ds, from);
+
+		dev_warn(ds->dev,
+			 "H topology detected, cutting RX from DSA link %d to CPU port %d to prevent TX packet loops\n",
+			 from, to);
+
+		sja1105_port_allow_traffic(l2fwd, from, to, false);
+
+		l2fwd[from].bc_domain &= ~BIT(to);
+		l2fwd[from].fl_domain &= ~BIT(to);
+	}
+
 	/* Finally, manage the egress flooding domain. All ports start up with
 	 * flooding enabled, including the CPU port and DSA links.
 	 */
--
2.25.1