lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed,  9 Feb 2022 23:30:42 +0200
From:   Vladimir Oltean <vladimir.oltean@....com>
To:     netdev@...r.kernel.org
Cc:     Jakub Kicinski <kuba@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Florian Fainelli <f.fainelli@...il.com>,
        Andrew Lunn <andrew@...n.ch>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Vladimir Oltean <olteanv@...il.com>,
        Roopa Prabhu <roopa@...dia.com>,
        Nikolay Aleksandrov <nikolay@...dia.com>,
        Jiri Pirko <jiri@...dia.com>, Ido Schimmel <idosch@...dia.com>,
        Rafael Richter <rafael.richter@....de>,
        Daniel Klauer <daniel.klauer@....de>,
        Tobias Waldekranz <tobias@...dekranz.com>
Subject: [RFC PATCH net-next 4/5] net: bridge: switchdev: replay VLANs present on the bridge device itself

The mv88e6xxx DSA driver currently prints this error message as soon as
the first port of a switch joins a bridge:

mv88e6085 0x0000000008b96000:00: port 0 failed to add a6:ef:77:c8:5f:3d vid 1 to fdb: -95

where a6:ef:77:c8:5f:3d vid 1 is a local FDB entry corresponding to the
bridge MAC address in the default_pvid.

The -EOPNOTSUPP is returned by mv88e6xxx_port_db_load_purge() because it
tries to map VID 1 to a FID (the ATU is indexed by FID not VID), but
fails to do so. This is because ->port_fdb_add() is called before
->port_vlan_add() for VID 1.

The abridged timeline of the calls is:

br_add_if
-> netdev_master_upper_dev_link
   -> dsa_port_bridge_join
      -> switchdev_bridge_port_offload
         -> br_switchdev_vlan_replay (*)
         -> br_switchdev_fdb_replay
            -> mv88e6xxx_port_fdb_add
-> nbp_vlan_init
   -> nbp_vlan_add
      -> mv88e6xxx_port_vlan_add

and the issue is that at the time of (*), the bridge port isn't in VID 1
(nbp_vlan_init hasn't been called), therefore br_switchdev_vlan_replay()
won't have anything to replay, therefore VID 1 won't be in the VTU by
the time mv88e6xxx_port_fdb_add() is called.

This happens only when the first port of a switch joins. For further
ports, the initial mv88e6xxx_port_vlan_add() is sufficient for VID 1 to
be loaded in the VTU (which is switch-wide, not per port).

The problem is somewhat unique to mv88e6xxx by chance, because most
other drivers offload an FDB entry by VID, so FDBs and VLANs can be
added asynchronously with respect to each other, but addressing the
issue at the bridge layer makes sense, since what mv88e6xxx requires
isn't absurd.

To fix this problem, we need to recognize that it isn't the VLAN group
of the port that we're interested in, but the VLAN group of the bridge
itself (so it isn't a timing issue, but rather insufficient information
being passed from switchdev to drivers).

Currently nbp_switchdev_sync_objs() only calls br_switchdev_vlan_replay()
for VLANs corresponding to the port, but the VLANs corresponding to the
bridge itself, for local termination, also need to be replayed.
In this case, VID 1 is not (yet) present in the port's VLAN group but is
present in the bridge's VLAN group.

Call br_switchdev_vlan_replay() once more, with the br_dev passed as the
"dev" argument. The function is already prepared to deal with this and
search in the right VLAN group, just that the functionality hasn't been
utilized yet.

Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
---
 net/bridge/br_switchdev.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index f8fbaaa7c501..d2d59fcfc0d4 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -681,11 +681,18 @@ static int nbp_switchdev_sync_objs(struct net_bridge_port *p, const void *ctx,
 	struct net_device *dev = p->dev;
 	int err;
 
+	/* Port VLANs */
 	err = br_switchdev_vlan_replay(br_dev, dev, ctx, true, blocking_nb,
 				       extack);
 	if (err && err != -EOPNOTSUPP)
 		return err;
 
+	/* And VLANs on the bridge device */
+	err = br_switchdev_vlan_replay(br_dev, br_dev, ctx, true, blocking_nb,
+				       extack);
+	if (err && err != -EOPNOTSUPP)
+		return err;
+
 	err = br_switchdev_mdb_replay(br_dev, dev, ctx, true, blocking_nb,
 				      extack);
 	if (err && err != -EOPNOTSUPP)
@@ -708,6 +715,8 @@ static void nbp_switchdev_unsync_objs(struct net_bridge_port *p,
 
 	br_switchdev_vlan_replay(br_dev, dev, ctx, false, blocking_nb, NULL);
 
+	br_switchdev_vlan_replay(br_dev, br_dev, ctx, false, blocking_nb, NULL);
+
 	br_switchdev_mdb_replay(br_dev, dev, ctx, false, blocking_nb, NULL);
 
 	br_switchdev_fdb_replay(br_dev, ctx, false, atomic_nb);
-- 
2.25.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ