[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250118011803.xqlvdzizpwnytii3@skbuf>
Date: Sat, 18 Jan 2025 03:18:03 +0200
From: Vladimir Oltean <olteanv@...il.com>
To: Tim Harvey <tharvey@...eworks.com>
Cc: Tristram.Ha@...rochip.com, Arun.Ramadoss@...rochip.com, andrew@...n.ch,
davem@...emloft.net, Woojung.Huh@...rochip.com,
linux-kernel@...r.kernel.org, pabeni@...hat.com,
netdev@...r.kernel.org, edumazet@...gle.com,
UNGLinuxDriver@...rochip.com, kuba@...nel.org
Subject: Re: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering
On Fri, Jan 17, 2025 at 01:02:31PM -0800, Tim Harvey wrote:
> The flaw with that patch is that enabling the multicast address table
> invokes other default rules in the table that need to be re-configured
> for the cpu port but the patch only configures group 0
> (01-80-C2-00-00-00). It fails to configure group 6 (01-80-C2-00-00-08)
> which is also used for STP so I would argue that it doesn't even do
> what the commit log says it does. It also has the side effect of
> disabling forwarding of other groups that were previously forwarded:
> - group 1 01-80-C2-00-00-01 (MAC Control Frame) (previously were
> forwarded, now are dropped)
> - group 2 01-80-C2-00-00-03 (802.1X access control) (previously were
> forwarded, now are forwarded to the highest port which may not be the
> cpu port)
> - group 4 01-80-C2-00-00-20 (GMRP) (previously were forwarded, now
> forwarded to all except the highest port number which may not be the
> cpu port)
> - group 5 01-80-C2-00-00-21 (GVRP) (previously were forwarded, now
> forwarded to all except the highest port number which may not be the
> cpu port)
> - group 6 01-80-C2-00-00-02, 01-80-C2-00-00-04 - 01-80-C2-00-00-0F
> (previously were forwarded, now are forwarded to the highest port
> which may not be the cpu port)
> - group 7 01-80-C2-00-00-11 - 01-80-C2-00-00-1F, 01-80-C2-00-00-22 -
> 01-80-C2-00-00-2F (previously were forwarded, now forwarded to all
> except the highest port number which may not be the cpu port)
> To fix this, I propose adding a function to configure each of the
> above groups (which are hardware filtering functions of the switch)
> with proper port masks but I need to know from the DSA experts what is
> desired for the port mask of those groups. The multicast address table
> can only invoke rules based on those groups of addresses so if that is
> not flexible enough then the multicast address table should instead be
> disabled.
The recommendation from the DSA maintainers will be to follow what the
software bridge data path does, which just means testing and seeing how
each group reacts to the known inputs which might affect it, i.e.:
- is it a group of the form 01-80-c2-00-00-0X? if yes, group_fwd_mask
should dictate how it is forwarded by software. All that hardware
needs to take care of is to send it just to the CPU.
- is multicast flooding enabled on the egress port?
- is there an MDB entry towards the egress port? how about another port?
The groups outside the 01-80-c2-00-00-0X range should be treated as
regular multicast, i.e. group_fwd_mask doesn't matter, and mdb/flooding
does.
One easy way out, if synchronizing the hardware port masks with the
software state turns out too hard, is to configure the switch to send
all these groups just to the CPU, and make sure skb->offload_fwd_mark is
unset for packets belonging to these groups (don't call
dsa_default_offload_fwd_mark() from the tagger). The software takes this
as a cue that it should forward them where the hardware didn't reach.
Also, never exclude the CPU port from the destination port mask, unless
you really, really know what you're doing. The software bridge might
need to forward to another foreign (non-switch) bridge port which is an
Intel e1000 card, or a Wi-Fi AP, or a tunnel, and by cutting out the CPU
from the flood path, you're taking that possibility away from it.
Here's a script to get you started with testing.
#!/bin/bash
ARP=" \
ff:ff:ff:ff:ff:ff 00:00:de:ad:be:ef 08 06 00 01 \
08 00 06 04 00 01 e0 07 1b 81 13 40 c0 a8 01 ad \
00 00 00 00 00 00 c0 a8 01 ea"
groups=( \
01:80:C2:00:00:00 \
01:80:C2:00:00:08 \
01:80:C2:00:00:01 \
01:80:C2:00:00:03 \
01:80:C2:00:00:20 \
01:80:C2:00:00:21 \
01:80:C2:00:00:02 \
01:80:C2:00:00:04 \
01:80:C2:00:00:0F \
01:80:C2:00:00:11 \
01:80:C2:00:00:1F \
01:80:C2:00:00:22 \
01:80:C2:00:00:2F \
)
pkt_count=1000
mac_get()
{
local if_name=$1
ip -j link show dev $if_name | jq -r '.[]["address"]'
}
get_rx_stats()
{
local if_name=$1
ip -j -s link show $if_name | jq '.[].stats64.rx.packets'
}
last_nibble()
{
local macaddr=$1
echo "0x${macaddr:0-1}"
}
send_raw()
{
local if_name=$1; shift
local group=$1; shift
local pkt="$1"; shift
local smac=$(mac_get $if_name)
pkt="${pkt/ff:ff:ff:ff:ff:ff/$group}"
pkt="${pkt/00:00:de:ad:be:ef/$smac}"
mausezahn -c $pkt_count -q $if_name "$pkt"
}
run_test()
{
before=$(get_rx_stats veth4)
send_raw veth0 $group "$ARP"
after=$(get_rx_stats veth4)
delta=$((after - before))
[ $delta -ge $pkt_count ] && echo "forwarded" || echo "not forwarded"
}
# br0
# / | \
# / | \
# / | \
# / | \
# veth1 veth3 veth5
# | | |
# veth0 veth2 veth4
ip link add veth0 type veth peer name veth1
ip link add veth2 type veth peer name veth3
ip link add veth4 type veth peer name veth5
ip link add br0 type bridge && ip link set br0 up
ip link set veth1 master br0 && ip link set veth1 up
ip link set veth3 master br0 && ip link set veth3 up
ip link set veth5 master br0 && ip link set veth5 up
ip link set veth0 up && ip link set veth2 up && ip link set veth4 up
for group in "${groups[@]}"; do
ip link set veth5 type bridge_slave mcast_flood on
with_flooding=$(run_test $group)
ip link set veth5 type bridge_slave mcast_flood off
without_flooding=$(run_test $group)
bridge mdb add dev br0 port veth5 grp $group permanent
with_mdb_and_no_flooding=$(run_test $group)
bridge mdb del dev br0 port veth5 grp $group permanent # restore
ip link set veth5 type bridge_slave mcast_flood on # restore
bridge mdb add dev br0 port veth3 grp $group permanent
with_mdb_on_another_port=$(run_test $group)
bridge mdb del dev br0 port veth3 grp $group permanent # restore
ip link set br0 type bridge group_fwd_mask $((1 << $(last_nibble $group))) 2>/dev/null
if [ $? = 0 ]; then
with_group_fwd_mask=$(run_test $group)
ip link set br0 type bridge group_fwd_mask 0 # restore
else
with_group_fwd_mask="can't test"
fi
printf "Group %s: %s with flooding, %s without flooding, %s with mdb and no flooding, %s with mdb on another port and flooding, %s with group_fwd_mask\n" \
"$group" \
"$with_flooding" \
"$without_flooding" \
"$with_mdb_and_no_flooding" \
"$with_mdb_on_another_port" \
"$with_group_fwd_mask" \
done
ip link del veth0
ip link del veth2
ip link del veth4
ip link del br0
Powered by blists - more mailing lists