[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250810163229.otapw4mhtv7e35jp@skbuf>
Date: Sun, 10 Aug 2025 19:32:29 +0300
From: Vladimir Oltean <olteanv@...il.com>
To: Daniel Golle <daniel@...rotopia.org>
Cc: Hauke Mehrtens <hauke@...ke-m.de>, Andrew Lunn <andrew@...n.ch>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Arkadi Sharshevsky <arkadis@...lanox.com>,
Florian Fainelli <f.fainelli@...il.com>, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org,
Andreas Schirm <andreas.schirm@...mens.com>,
Alexander Sverdlin <alexander.sverdlin@...mens.com>,
Lukas Stockmann <lukas.stockmann@...mens.com>,
John Crispin <john@...ozen.org>
Subject: Re: [PATCH/RFC net] net: dsa: lantiq_gswip: honor dsa_db passed to
port_fdb_{add,del}
On Sun, Aug 10, 2025 at 03:48:30PM +0100, Daniel Golle wrote:
> [ 66.300000] gswip 1e108000.switch: port 3 failed to add 6a:94:c2:xx:xx:xx vid 1 to fdb: -22
> [ 66.300000] gswip 1e108000.switch: port 3 failed to add 1a:f8:a8:xx:xx:xx vid 0 to fdb: -22
> [ 66.320000] gswip 1e108000.switch: port 3 failed to add 1a:f8:a8:xx:xx:xx vid 1 to fdb: -22
> [ 66.320000] gswip 1e108000.switch: port 3 failed to delete 6a:94:c2:xx:xx:xx vid 1 from fdb: -2
>
> So the problem is apparently that at the point of calling br_add_if() the
> port obviously isn't (yet) a member of the bridge and hence
> dsa_port_bridge_dev_get() would still return NULL at this point, which
> then causes gswip_port_fdb() to return -EINVAL.
Nope, this theory is false because the user port _is_ a member of the
bridge when it processes the SWITCHDEV_FDB_ADD_TO_DEVICE events.
There are 2 cases for handling these events. One is handling past events
which were missed and are re-generated during FDB replay:
[ 65.510000] [<807ed128>] dsa_user_fdb_event+0x110/0x1c8
[ 65.510000] [<807f89b8>] __switchdev_handle_fdb_event_to_device+0x138/0x228
[ 65.510000] [<807f8ad8>] switchdev_handle_fdb_event_to_device+0x30/0x48
[ 65.510000] [<807ec328>] dsa_user_switchdev_event+0x90/0xb0
[ 65.510000] [<807c43c0>] br_switchdev_fdb_replay+0xd0/0x138
[ 65.510000] [<807c4de8>] br_switchdev_port_offload+0x240/0x39c
[ 65.510000] [<80799b6c>] br_switchdev_blocking_event+0x80/0xec
[ 65.510000] [<80065e20>] raw_notifier_call_chain+0x48/0x88
[ 65.510000] [<807f83e0>] switchdev_bridge_port_offload+0x5c/0xd0
[ 65.510000] [<807e4c90>] dsa_port_bridge_join+0x170/0x410
[ 65.510000] [<807ed5fc>] dsa_user_changeupper.part.0+0x40/0x180
[ 65.510000] [<807f0ac0>] dsa_user_netdevice_event+0x5b4/0xc34
[ 65.510000] [<80065e20>] raw_notifier_call_chain+0x48/0x88
[ 65.510000] [<805edeec>] __netdev_upper_dev_link+0x1bc/0x450
[ 65.510000] [<805ee1dc>] netdev_master_upper_dev_link+0x2c/0x38
[ 65.510000] [<807a055c>] br_add_if+0x494/0x890
If you look at the order of operations, you'll see that:
int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br,
struct netlink_ext_ack *extack)
{
...
err = dsa_port_bridge_create(dp, br, extack); // this sets dp->bridge
if (err)
return err;
brport_dev = dsa_port_to_bridge_port(dp);
info.bridge = *dp->bridge;
err = dsa_broadcast(DSA_NOTIFIER_BRIDGE_JOIN, &info); // this calls ds->ops->port_bridge_join()
if (err)
goto out_rollback;
/* Drivers which support bridge TX forwarding should set this */
dp->bridge->tx_fwd_offload = info.tx_fwd_offload;
err = switchdev_bridge_port_offload(brport_dev, dev, dp,
&dsa_user_switchdev_notifier,
&dsa_user_switchdev_blocking_notifier,
dp->bridge->tx_fwd_offload, extack); // this calls br_switchdev_fdb_replay()
if (err)
goto out_rollback_unbridge;
...
}
by the time br_switchdev_fdb_replay() is called, dp->bridge correctly
reflects the bridge that is generating the events.
The problem is not a race condition, the problem is that the driver does
not correctly handle host FDB entries.
The truly revealing step is to uncomment this:
netdev_dbg(dev, "%s FDB entry towards %s, addr %pM vid %d%s\n",
event == SWITCHDEV_FDB_ADD_TO_DEVICE ? "Adding" : "Deleting",
orig_dev->name, fdb_info->addr, fdb_info->vid,
host_addr ? " as host address" : "");
and see it will print "as host address", meaning dsa_port_bridge_host_fdb_add()
will be called.
At the DSA cross-chip notifier layer, this generates a DSA_NOTIFIER_HOST_FDB_ADD
event rather than the port-level DSA_NOTIFIER_FDB_ADD. The major difference in
handling is that HOST_FDB_ADD events are matched by the *CPU* port, see
dsa_port_host_address_match().
The CPU port is not part of the bridge that generated the host FDB entry,
only the user port it services is.
The problem originates, roughly speaking, since commit 10fae4ac89ce
("net: dsa: include bridge addresses which are local in the host fdb
list"), which first appeared in v5.14. We rolled out a new feature using
existing API, and didn't notice the gswip driver wouldn't tolerate
ds->ops->port_fdb_add() getting called on the CPU port.
Anyway, your intuition for fixing this properly was somewhat correct.
Even if gswip does not implement FDB isolation, it is correct to look at
the dsa_db :: bridge member to see which bridge originated the host FDB
entry. That was its exact purpose, as documented in section "Address
databases" of Documentation/networking/dsa/dsa.rst. Even if cpu_dp->bridge
is NULL (as expected), drivers have to know on behalf of which user port
(member of a bridge) the CPU port is filtering an entry towards the host.
This is because CPU and DSA ports are servicing multiple address databases.
The only problem is that the API you're making use for fixing this was
introduced in commit c26933639b54 ("net: dsa: request drivers to perform
FDB isolation"), first appeared in v5.18. Thus, you can't fix an issue
in linux-5.15.y using this method, unless the API change is backported.
Alternatively, for stable kernels you could suppress the error and
return 0 in the gswip driver, with the appropriate comment that the API
doesn't communicate the bridge ID for host FDB entries, and thus, the
driver just won't filter them. Then, you could develop the solution
further for net-next, keeping in mind how things are supposed to work.
Powered by blists - more mailing lists