Message-Id: <20220408200337.718067-7-vladimir.oltean@nxp.com>
Date:   Fri,  8 Apr 2022 23:03:37 +0300
From:   Vladimir Oltean <vladimir.oltean@....com>
To:     netdev@...r.kernel.org
Cc:     Jakub Kicinski <kuba@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Florian Fainelli <f.fainelli@...il.com>,
        Andrew Lunn <andrew@...n.ch>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Vladimir Oltean <olteanv@...il.com>,
        UNGLinuxDriver@...rochip.com, Paolo Abeni <pabeni@...hat.com>,
        Roopa Prabhu <roopa@...dia.com>,
        Nikolay Aleksandrov <nikolay@...dia.com>,
        Jiri Pirko <jiri@...dia.com>, Ido Schimmel <idosch@...dia.com>,
        Tobias Waldekranz <tobias@...dekranz.com>,
        Mattias Forsblad <mattias.forsblad@...il.com>
Subject: [PATCH net-next 6/6] net: bridge: avoid uselessly making offloaded ports promiscuous

The bridge driver's intention by making ports promiscuous is to turn off
their RX filters such that these ports receive packets with any MAC DA.

A quick survey of the kernel drivers that call
switchdev_bridge_port_offload() shows that either these do not implement
ndo_change_rx_flags() at all, or they explicitly ignore changes to
IFF_PROMISC (am65_cpsw_slave_set_promisc, cpsw_set_promiscious,
ocelot_set_rx_mode).

This makes sense, because hardware that is purpose-built to do L2
forwarding generally already knows it should accept any MAC DA on its
ports.

That is not to say that IFF_PROMISC makes no sense for switchdev drivers.
For example, DSA has the concept of multiple address databases (this is
achieved by effectively partitioning the FDB: reserve a database - FID -
for each port operating as standalone, a FID for each VLAN-unaware
bridge, a FID for each bridge VLAN). The address database of a
standalone port is managed through the standard dev->uc and dev->mc
lists and is used to filter towards the host the addresses required for
local termination. The bridge-related address databases are managed
using switchdev (SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE).

IFF_PROMISC is intrinsically connected to dev->uc and dev->mc (see the
implementation of __dev_set_rx_mode which puts the interface in
promiscuous mode if the unicast list isn't empty but the device doesn't
support IFF_UNICAST_FLT), and therefore to what DSA implements as the
standalone port address database (there, an entry in dev->uc means
"forward it to CPU", the absence of it means "drop it", and promiscuity
means "put the CPU in the flood mask of packets with unknown MAC DA").
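
As a rough illustration of the dev->uc fallback described above, here is
a minimal userspace sketch. It is not the kernel's __dev_set_rx_mode();
struct model_netdev, its fields and model_set_rx_mode() are hypothetical
names that only model the behaviour summarized in this paragraph.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few net_device fields involved;
 * this is not the kernel structure.
 */
struct model_netdev {
	bool unicast_flt;	/* models IFF_UNICAST_FLT support */
	bool uc_promisc;	/* promiscuity taken on behalf of dev->uc */
	int uc_count;		/* number of entries in dev->uc */
	int promiscuity;	/* promiscuity reference count */
};

/* Without unicast filtering, a non-empty dev->uc list can only be
 * honoured by making the interface promiscuous. The reference is
 * dropped again once the list becomes empty.
 */
static void model_set_rx_mode(struct model_netdev *dev)
{
	assert(dev != NULL);

	if (dev->unicast_flt)
		return;		/* the device filters dev->uc by itself */

	if (dev->uc_count > 0 && !dev->uc_promisc) {
		dev->promiscuity++;
		dev->uc_promisc = true;
	} else if (dev->uc_count == 0 && dev->uc_promisc) {
		dev->promiscuity--;
		dev->uc_promisc = false;
	}
}
```

Calling model_set_rx_mode() repeatedly with an unchanged list takes the
promiscuity reference only once, which is why the sketch tracks
uc_promisc separately from the counter.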

In contrast, there is no IFF_PROMISC equivalent for the FDB entries
notified through switchdev (and therefore for the bridge-related address
databases), because none is needed.

In this model, the bridge driver, which is only trying to secure its
reception of packets, is in fact overstepping, because it manages
something which is outside of its competence: the host flooding of the
standalone port database, when in fact that database will not be the one
used by packets handled by the bridging service.

In turn, this prevents further optimizations from being applied in
particular to DSA, and in general to any switchdev driver. A desirable
goal is to eliminate host flooding of packets which are known to be
unnecessary and only dropped later in software [1].

In an ideal world with ideal hardware:
(a) flooding would be controlled per FID rather than per port
(b) egress flooding towards a certain port would be controllable
    independently for each ingress port, rather than globally,
    regardless of ingress port

When (a) does not hold true, the bridge will force the port to keep host
flooding enabled, even if this is not otherwise needed (there is no
station behind a "foreign interface" that requires software forwarding;
the only packets sent by the accelerator to the CPU are for termination
purposes).

When (b) does not hold true, it means that a 4-port switch where 1 port
is standalone and 3 are bridged (again with no foreign interface) will
have host flooding enabled for all 4 ports (including the standalone
port, because the bridge is keeping host flooding enabled, and all ports
are serviced by the same CPU port).
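
The 4-port case can be sketched with a small userspace model. This is a
simplification under stated assumptions, not a description of any real
switch: struct flood_port and model_host_flooded() are hypothetical, and
the model assumes a single shared CPU port whose membership in the
unknown-DA flood mask cannot depend on the ingress port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_MODEL_PORTS 4

/* Hypothetical per-port state for the illustration. */
struct flood_port {
	bool bridged;	/* port is a bridge port */
	bool promisc;	/* something requested IFF_PROMISC on this port */
};

/* When (b) does not hold, the CPU port is either in the flood mask for
 * every ingress port or for none of them. A single promiscuous port is
 * therefore enough to enable host flooding for all ports.
 */
static bool model_host_flooded(const struct flood_port *ports, size_t n,
			       size_t ingress)
{
	size_t i;

	assert(ports != NULL && ingress < n);

	/* The ingress port deliberately plays no role in the result. */
	for (i = 0; i < n; i++)
		if (ports[i].promisc)
			return true;

	return false;
}
```

With port 0 standalone and non-promiscuous, and ports 1-3 bridged and
made promiscuous by the bridge, model_host_flooded() returns true for
port 0 as well. It returns false only once no port is promiscuous.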

Since DSA is a framework and not just a driver for a single device,
these nonidealities do hold true, and the bridge unnecessarily setting
IFF_PROMISC on its ports is a real roadblock towards disabling host
flooding in practical scenarios.

The proposed solution is to make the bridge driver stop touching port
promiscuity for offloaded switchdev ports, and let them manage
promiscuity by themselves as they see fit. It can achieve this by
looking at net_bridge_port :: offload_count, which is updated
voluntarily by switchdev drivers using switchdev_bridge_port_offload().

br_manage_promisc() is already called by nbp_update_port_count() on a
port join/leave, and the implicit assumption is that
switchdev_bridge_port_offload() has already been called by that time
(from netdev_master_upper_dev_link).
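
For readers who prefer to see the resulting per-port decision in
isolation, the order of checks in the diff can be modeled in userspace
roughly as follows. The types and model_port_promisc() are hypothetical
stand-ins for net_bridge, net_bridge_port and the body of the loop in
br_manage_promisc(); only the order of the checks is taken from the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_action { MODEL_CLEAR_PROMISC, MODEL_SET_PROMISC };

/* Hypothetical stand-ins for the bridge and port state consulted. */
struct promisc_bridge {
	bool promisc;		/* br->dev->flags & IFF_PROMISC */
	bool vlan_enabled;	/* br_vlan_enabled(br->dev) */
	int auto_cnt;		/* br->auto_cnt */
};

struct promisc_port {
	int offload_count;	/* p->offload_count */
	bool auto_port;		/* br_auto_port(p) */
};

static enum model_action
model_port_promisc(const struct promisc_bridge *br,
		   const struct promisc_port *p)
{
	assert(br != NULL && p != NULL);

	/* Offloaded ports are checked first and never made promiscuous. */
	if (p->offload_count)
		return MODEL_CLEAR_PROMISC;

	/* A promiscuous bridge makes its remaining ports promiscuous. */
	if (br->promisc)
		return MODEL_SET_PROMISC;

	/* So does a bridge with VLAN filtering disabled. */
	if (!br->vlan_enabled)
		return MODEL_SET_PROMISC;

	/* Otherwise the pre-existing auto-port rule applies. */
	if (br->auto_cnt == 0 || (br->auto_cnt == 1 && p->auto_port))
		return MODEL_CLEAR_PROMISC;

	return MODEL_SET_PROMISC;
}
```

Note that in this model an offloaded port stays non-promiscuous even
when the bridge device itself is promiscuous, because the offload check
comes before the others.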

[1] https://www.youtube.com/watch?v=B1HhxEcU7Jg

Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
---
 net/bridge/br_if.c | 63 ++++++++++++++++++++++++++++------------------
 1 file changed, 39 insertions(+), 24 deletions(-)

diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 55f47cadb114..6ac5313e1cb8 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -135,34 +135,49 @@ static void br_port_clear_promisc(struct net_bridge_port *p)
 void br_manage_promisc(struct net_bridge *br)
 {
 	struct net_bridge_port *p;
-	bool set_all = false;
-
-	/* If vlan filtering is disabled or bridge interface is placed
-	 * into promiscuous mode, place all ports in promiscuous mode.
-	 */
-	if ((br->dev->flags & IFF_PROMISC) || !br_vlan_enabled(br->dev))
-		set_all = true;
 
 	list_for_each_entry(p, &br->port_list, list) {
-		if (set_all) {
+		/* Offloaded ports have a separate address database for
+		 * forwarding, which is managed through switchdev and not
+		 * through dev_uc_add(), so the promiscuous concept makes no
+		 * sense for them. Avoid updating promiscuity in that case.
+		 */
+		if (p->offload_count) {
+			br_port_clear_promisc(p);
+			continue;
+		}
+
+		/* If bridge is promiscuous, unconditionally place all ports
+		 * in promiscuous mode too. This allows the bridge device to
+		 * locally receive all unknown traffic.
+		 */
+		if (br->dev->flags & IFF_PROMISC) {
+			br_port_set_promisc(p);
+			continue;
+		}
+
+		/* If vlan filtering is disabled, place all ports in
+		 * promiscuous mode.
+		 */
+		if (!br_vlan_enabled(br->dev)) {
 			br_port_set_promisc(p);
-		} else {
-			/* If the number of auto-ports is <= 1, then all other
-			 * ports will have their output configuration
-			 * statically specified through fdbs.  Since ingress
-			 * on the auto-port becomes forwarding/egress to other
-			 * ports and egress configuration is statically known,
-			 * we can say that ingress configuration of the
-			 * auto-port is also statically known.
-			 * This lets us disable promiscuous mode and write
-			 * this config to hw.
-			 */
-			if (br->auto_cnt == 0 ||
-			    (br->auto_cnt == 1 && br_auto_port(p)))
-				br_port_clear_promisc(p);
-			else
-				br_port_set_promisc(p);
+			continue;
 		}
+
+		/* If the number of auto-ports is <= 1, then all other ports
+		 * will have their output configuration statically specified
+		 * through fdbs. Since ingress on the auto-port becomes
+		 * forwarding/egress to other ports and egress configuration is
+		 * statically known, we can say that ingress configuration of
+		 * the auto-port is also statically known.
+		 * This lets us disable promiscuous mode and write this config
+		 * to hw.
+		 */
+		if (br->auto_cnt == 0 ||
+		    (br->auto_cnt == 1 && br_auto_port(p)))
+			br_port_clear_promisc(p);
+		else
+			br_port_set_promisc(p);
 	}
 }
 
-- 
2.25.1
