Message-ID: <87a6d6ilrl.fsf@waldekranz.com>
Date: Thu, 31 Mar 2022 10:11:58 +0200
From: Tobias Waldekranz <tobias@...dekranz.com>
To: Vladimir Oltean <olteanv@...il.com>,
Mattias Forsblad <mattias.forsblad@...il.com>
Cc: netdev@...r.kernel.org, "David S . Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Andrew Lunn <andrew@...n.ch>,
Florian Fainelli <f.fainelli@...il.com>,
Vivien Didelot <vivien.didelot@...il.com>,
Roopa Prabhu <roopa@...dia.com>,
Nikolay Aleksandrov <razor@...ckwall.org>,
Joachim Wiberg <troglobit@...il.com>,
Ido Schimmel <idosch@...sch.org>,
"Allan W. Nielsen" <allan.nielsen@...rochip.com>,
UNGLinuxDriver@...rochip.com
Subject: Re: [PATCH 4/5] mv88e6xxx: Offload the flood flag
On Thu, Mar 17, 2022 at 15:05, Vladimir Oltean <olteanv@...il.com> wrote:
> On Thu, Mar 17, 2022 at 07:50:30AM +0100, Mattias Forsblad wrote:
>> Use the port vlan table to restrict ingressing traffic to the
>> CPU port if the flood flags are cleared.
>>
>> Signed-off-by: Mattias Forsblad <mattias.forsblad@...il.com>
>> ---
>
> There is a grave mismatch between what this patch says it does and what
> it really does. (=> NACK)
>
> Doing some interpolation from previous commit descriptions, the
> intention is to disable flooding from a given port towards the CPU
> (which, I mean, is fair enough as a goal).
>
> But:
> (a) mv88e6xxx_port_vlan() disables _forwarding_ from port A to port B.
> So this affects not only unknown traffic (the one which is flooded),
> but all traffic
> (b) even if br_flood_enabled() is false (meaning that the bridge device
> doesn't want to locally process flooded packets), there is no
> equality sign between this and disabling flooding on the CPU port.
> If the DSA switch is bridged with a foreign (non-DSA) interface, be
> it a tap, a Wi-Fi AP, or a plain Ethernet port, then from the
> switch's perspective, this is no different from a local termination
> flow (packets need to be forwarded to the CPU). Yet from the
> bridge's perspective, it is a forwarding and not a termination flow.
> So you can't _just_ disable CPU flooding/forwarding when the bridge
> doesn't want to locally terminate traffic.
>
> Regarding (b), I've CC'ed Allan Nielsen who held this presentation a few
> years ago, and some ideas were able to be materialized in the meantime:
> https://www.youtube.com/watch?v=B1HhxEcU7Jg
>
> Regarding (a), have you seen the new dsa_port_manage_cpu_flood() from
> the DSA unicast filtering patch series?
> https://patchwork.kernel.org/project/netdevbpf/patch/20220302191417.1288145-6-vladimir.oltean@nxp.com/
> It is incomplete work in the sense that
>
> (1) it disables CPU flooding only if there isn't any port with IFF_PROMISC,
> but the bridge puts all ports in promiscuous mode. I think we can
> win that battle here, and convince bridge/switchdev maintainers to
> not put offloaded bridge ports (those that call switchdev_bridge_port_offload)
> in promiscuous mode, since it serves no purpose and that actively
> bothers us. At least the way DSA sees this is that unicast filtering
> and promiscuous mode deal with standalone mode. The forwarding plane
> is effectively a different address database and there is no direct
> equivalent to promiscuity there.
>
> (2) Right now DSA calls ->port_bridge_flags() from dsa_port_manage_cpu_flood(),
> i.e. it treats CPU flooding as a purely per-port-egress setting.
> But once I manage to straighten some kinks in DSA's unicast
> filtering support for switches with ds->vlan_filtering_is_global (in
> other words, make sja1105 eligible for unicast filtering), I pretty
> much plan to change this by making DSA ask the driver to manage CPU
> flooding per user port - leaving this code path as just a fallback.
>
> As baroque as I consider the sja1105 hardware to be, I'm surprised it
> has a feature which mv88e6xxx doesn't seem to - which is having flood
> controls per {ingress port, egress port} pair. So we'll have to
> improvise here.
>
> Could you tell me - ok, you remove the CPU port from the port VLAN map -
> but if you install host FDB entries as ACL entries (so as to make the
> switch generate a TO_CPU packet instead of a FORWARD packet), doesn't
> the switch in fact send packets to the CPU even in lack of the CPU
> port's membership in the port VLAN table for the bridge port?
>
> If I'm right and it does, then I do see a path forward for this, with
> zero user space additions, and working by default. We make the bridge
> stop uselessly making offloaded DSA bridge ports promiscuous, then we
> make DSA manage CPU flooding by itself - taking promiscuity into account
> but also foreign interfaces joining/leaving. Then we make host addresses
> be delivered by mv88e6xxx to the CPU as trapped and not forwarded, then
> from the new DSA ->port_set_cpu_flood() callback we remove the CPU port
> from the port VLAN table.
>
> What do you think?

It's an interesting idea. For unicast entries you could maybe get away
with it. It would mean, though, that we would be limited to assisted CPU
learning, since there is no way for the switch to autonomously generate
ACL entries ("Policy entries" in ATU parlance). By extension, this also
means that the Learn2All functionality goes out the window in multichip
setups for addresses associated with the CPU.

For multicast though, I'm not sure that it would work in a multichip
system. As you say, a frame matching a policy entry will be sent with a
TO_CPU tag; the problem is that I think that applies to all DSA ports,
not just the CPU port. So in this system:

      CPU
       |
    .--0--.   .-----.
    | sw0 3---0 sw1 |
    '-1-2-'   '-1-2-'

If we have a multicast group with subscribers behind sw0p{0,2} and
sw1p2, we need the following ATU entries:

sw0:
    da:01:00:5e:01:02:03 vid:0 state:policy dpv:0,2,3

sw1:
    da:01:00:5e:01:02:03 vid:0 state:policy dpv:0,2

When this group ingresses on sw0p1, I suspect it will egress
sw0p{0,2,3}, but on ingress at sw1p0 the frame will be dropped since it
will contain a TO_CPU tag (and sw1's CPU-facing port is the ingress
port).

Similarly, when this group ingresses on sw1p1, it will egress sw1p{0,2},
but since it is tagged with TO_CPU on ingress to sw0p3, it won't reach
sw0p2.

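To make the two cases above concrete, here is a toy model in plain C. It
is only a sketch of the behaviour I suspect, not driver code and not a
description of what the hardware is documented to do. The names
(toy_switch, toy_egress, toy_walkthrough) are made up for illustration,
and the rule "a TO_CPU frame only ever travels upstream, and is dropped
if it arrives on the upstream port" is exactly the assumption being
questioned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag states for this toy model only. */
enum toy_tag { TOY_TAG_NONE, TOY_TAG_TO_CPU };

/* One switch with a single policy entry for the multicast group. */
struct toy_switch {
	uint8_t dpv;        /* destination port vector of the policy entry */
	int upstream_port;  /* port leading towards the CPU */
};

/*
 * Return the egress port mask for a frame that hits the policy entry.
 * Assumption (not verified against hardware): a frame that already
 * carries a TO_CPU tag is only sent upstream, and is dropped when it
 * arrives on the upstream port itself. An untagged frame from a user
 * port egresses on the entry's dpv, minus the ingress port.
 */
static uint8_t toy_egress(const struct toy_switch *sw, int in_port,
			  enum toy_tag tag)
{
	if (tag == TOY_TAG_TO_CPU) {
		if (in_port == sw->upstream_port)
			return 0;
		return (uint8_t)(1u << sw->upstream_port);
	}

	return (uint8_t)(sw->dpv & ~(1u << in_port));
}

/* Walk through the two cases from the mail with the ATU entries above. */
void toy_walkthrough(void)
{
	struct toy_switch sw0 = { .dpv = 0x0d, .upstream_port = 0 }; /* 0,2,3 */
	struct toy_switch sw1 = { .dpv = 0x05, .upstream_port = 0 }; /* 0,2 */

	/* Ingress on sw0p1: egresses sw0p{0,2,3}, then dropped at sw1p0. */
	assert(toy_egress(&sw0, 1, TOY_TAG_NONE) == 0x0d);
	assert(toy_egress(&sw1, 0, TOY_TAG_TO_CPU) == 0x00);

	/* Ingress on sw1p1: egresses sw1p{0,2}, then only sw0p0 on sw0. */
	assert(toy_egress(&sw1, 1, TOY_TAG_NONE) == 0x05);
	assert(toy_egress(&sw0, 3, TOY_TAG_TO_CPU) == 0x01);
}
```

In both cases a subscriber on the far switch (sw1p2 in the first, sw0p2
in the second) is missed, which is the problem with using policy entries
for multicast across DSA links, if the assumption holds.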
>> drivers/net/dsa/mv88e6xxx/chip.c | 45 ++++++++++++++++++++++++++++++--
>> 1 file changed, 43 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
>> index 84b90fc36c58..39347a05c3a5 100644
>> --- a/drivers/net/dsa/mv88e6xxx/chip.c
>> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
>> @@ -1384,6 +1384,7 @@ static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port)
>> struct dsa_switch *ds = chip->ds;
>> struct dsa_switch_tree *dst = ds->dst;
>> struct dsa_port *dp, *other_dp;
>> + bool flood = true;
>> bool found = false;
>> u16 pvlan;
>>
>> @@ -1425,6 +1426,9 @@ static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port)
>>
>> pvlan = 0;
>>
>> + if (dp->bridge)
>> + flood = br_flood_enabled(dp->bridge->dev);
>> +
>> /* Frames from standalone user ports can only egress on the
>> * upstream port.
>> */
>> @@ -1433,10 +1437,11 @@ static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port)
>>
>> /* Frames from bridged user ports can egress any local DSA
>> * links and CPU ports, as well as any local member of their
>> - * bridge group.
>> + * bridge group. However, CPU ports are omitted if flood is
>> + * cleared.
>> */
>> dsa_switch_for_each_port(other_dp, ds)
>> - if (other_dp->type == DSA_PORT_TYPE_CPU ||
>> + if ((other_dp->type == DSA_PORT_TYPE_CPU && flood) ||
>> other_dp->type == DSA_PORT_TYPE_DSA ||
>> dsa_port_bridge_same(dp, other_dp))
>> pvlan |= BIT(other_dp->index);
>> @@ -2718,6 +2723,41 @@ static void mv88e6xxx_crosschip_bridge_leave(struct dsa_switch *ds,
>> mv88e6xxx_reg_unlock(chip);
>> }
>>
>> +static int mv88e6xxx_set_flood(struct dsa_switch *ds, int port, struct net_device *br,
>> + unsigned long mask, unsigned long val)
>> +{
>> + struct mv88e6xxx_chip *chip = ds->priv;
>> + struct dsa_bridge *bridge;
>> + struct dsa_port *dp;
>> + bool found = false;
>> + int err;
>> +
>> + if (!netif_is_bridge_master(br))
>> + return 0;
>> +
>> + list_for_each_entry(dp, &ds->dst->ports, list) {
>> + if (dp->ds == ds && dp->index == port) {
>> + found = true;
>> + break;
>> + }
>> + }
>> +
>> + if (!found)
>> + return 0;
>> +
>> + bridge = dp->bridge;
>> + if (!bridge)
>> + return 0;
>> +
>> + mv88e6xxx_reg_lock(chip);
>> +
>> + err = mv88e6xxx_bridge_map(chip, *bridge);
>> +
>> + mv88e6xxx_reg_unlock(chip);
>> +
>> + return err;
>> +}
>> +
>> static int mv88e6xxx_software_reset(struct mv88e6xxx_chip *chip)
>> {
>> if (chip->info->ops->reset)
>> @@ -6478,6 +6518,7 @@ static const struct dsa_switch_ops mv88e6xxx_switch_ops = {
>> .set_eeprom = mv88e6xxx_set_eeprom,
>> .get_regs_len = mv88e6xxx_get_regs_len,
>> .get_regs = mv88e6xxx_get_regs,
>> + .set_flood = mv88e6xxx_set_flood,
>> .get_rxnfc = mv88e6xxx_get_rxnfc,
>> .set_rxnfc = mv88e6xxx_set_rxnfc,
>> .set_ageing_time = mv88e6xxx_set_ageing_time,
>> --
>> 2.25.1
>>