[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87im9q8i99.fsf@waldekranz.com>
Date: Sat, 28 Nov 2020 00:58:10 +0100
From: Tobias Waldekranz <tobias@...dekranz.com>
To: Vladimir Oltean <vladimir.oltean@....com>, netdev@...r.kernel.org
Cc: kuba@...nel.org, andrew@...n.ch, f.fainelli@...il.com,
vivien.didelot@...il.com
Subject: Re: [PATCH net] net: dsa: reference count the host mdb addresses
On Fri, Oct 16, 2020 at 00:27, Vladimir Oltean <vladimir.oltean@....com> wrote:
> Currently any DSA switch that implements the multicast ops (properly,
> that is) gets these errors after just sitting for a while, with at least
> 2 ports bridged:
>
> [ 286.013814] mscc_felix 0000:00:00.5 swp3: failed (err=-2) to del object (id=3)
>
> The reason has to do with this piece of code:
>
> netdev_for_each_lower_dev(dev, lower_dev, iter)
> br_mdb_switchdev_host_port(dev, lower_dev, mp, type);
>
> called from:
>
> br_multicast_group_expired
> -> br_multicast_host_leave
> -> br_mdb_notify
> -> br_mdb_switchdev_host
>
> Basically, that code is correct. It tells each switchdev port that the
> host can leave that multicast group. But in the case of DSA, all user
> ports are connected to the host through the same pipe. So, because DSA
> translates a host MDB to a normal MDB on the CPU port, this means that
> when all user ports leave a multicast group, DSA tries to remove it N
> times from the CPU port.
>
> We should be reference-counting these addresses.
That sounds like a good idea. We have run into another issue with the
MDB that maybe could be worked into this changeset. This is what we have
observed on 4.19, but from looking at the source it does not look like
anything has changed with respect to this issue.
The DSA driver handles the addition/removal of router ports by
enabling/disabling multicast flooding to the port in question. On
mv88e6xxx at least, this is only part of the solution. It only takes
care of the unregistered multicast. You also have to iterate through all
_registered_ groups and add the port to the destination vector.
If you do that the naïve way, you run in to a secondary problem. Without
any caching middle layer, you now have a single bit in the vector that
can be set either because the port is a router port, or because someone
has joined the group, _OR_ both (if the querier moves due to a topology
change for example). So you need a cache of the joined ports for each
group, and the router port vector. That way you can make sure to only
clear the bit if neither the cached group entry nor the router port
vector has the bit set.
If you think it could be useful I could try to rebase our internal
solution on net-next and post it as an RFC. Maybe there are at least
parts that can be used from it.
Powered by blists - more mailing lists