lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAJ+vNU1R7zCsEGoM2LP5XDaTseLxiTaa+jD1STcW9ZNgaO16Sw@mail.gmail.com>
Date: Wed, 22 Jan 2025 17:48:51 -0800
From: Tim Harvey <tharvey@...eworks.com>
To: Vladimir Oltean <olteanv@...il.com>
Cc: Tristram.Ha@...rochip.com, Arun.Ramadoss@...rochip.com, andrew@...n.ch, 
	davem@...emloft.net, Woojung.Huh@...rochip.com, linux-kernel@...r.kernel.org, 
	pabeni@...hat.com, netdev@...r.kernel.org, edumazet@...gle.com, 
	UNGLinuxDriver@...rochip.com, kuba@...nel.org
Subject: Re: [PATCH net] net: dsa: microchip: ksz9477: fix multicast filtering

On Fri, Jan 17, 2025 at 5:18 PM Vladimir Oltean <olteanv@...il.com> wrote:
>
> On Fri, Jan 17, 2025 at 01:02:31PM -0800, Tim Harvey wrote:
> > The flaw with that patch is that enabling the multicast address table
> > invokes other default rules in the table that need to be re-configured
> > for the cpu port but the patch only configures group 0
> > (01-80-C2-00-00-00). It fails to configure group 6 (01-80-C2-00-00-08)
> > which is also used for STP so I would argue that it doesn't even do
> > what the commit log says it does. It also has the side effect of
> > disabling forwarding of other groups that were previously forwarded:
> > - group 1 01-80-C2-00-00-01 (MAC Control Frame) (previously were
> > forwarded, now are dropped)
> > - group 2 01-80-C2-00-00-03 (802.1X access control) (previously were
> > forwarded, now are forwarded to the highest port which may not be the
> > cpu port)
> > - group 4 01-80-C2-00-00-20 (GMRP) (previously were forwarded, now
> > forwarded to all except the highest port number which may not be the
> > cpu port)
> > - group 5 01-80-C2-00-00-21 (GVRP) (previously were forwarded, now
> > forwarded to all except the highest port number which may not be the
> > cpu port)
> > - group 6 01-80-C2-00-00-02, 01-80-C2-00-00-04 - 01-80-C2-00-00-0F
> > (previously were forwarded, now are forwarded to the highest port
> > which may not be the cpu port)
> > - group 7 01-80-C2-00-00-11 - 01-80-C2-00-00-1F, 01-80-C2-00-00-22 -
> > 01-80-C2-00-00-2F (previously were forwarded, now forwarded to all
> > except the highest port number which may not be the cpu port)
>
> > To fix this, I propose adding a function to configure each of the
> > above groups (which are hardware filtering functions of the switch)
> > with proper port masks but I need to know from the DSA experts what is
> > desired for the port mask of those groups. The multicast address table
> > can only invoke rules based on those groups of addresses so if that is
> > not flexible enough then the multicast address table should instead be
> > disabled.
>
> The recommendation from the DSA maintainers will be to follow what the
> software bridge data path does, which just means testing and seeing how
> each group reacts to the known inputs which might affect it, i.e.:
>
> - is it a group of the form 01-80-c2-00-00-0X? if yes, group_fwd_mask
>   should dictate how it is forwarded by software. All that hardware
>   needs to take care of is to send it just to the CPU.
>
> - is multicast flooding enabled on the egress port?
>
> - is there an MDB entry towards the egress port? how about another port?
>   The groups outside the 01-80-c2-00-00-0X range should be treated as
>   regular multicast, i.e. group_fwd_mask doesn't matter, and mdb/flooding
>   does.
>
> One easy way out, if synchronizing the hardware port masks with the
> software state turns out too hard, is to configure the switch to send
> all these groups just to the CPU, and make sure skb->offload_fwd_mark is
> unset for packets belonging to these groups (don't call
> dsa_default_offload_fwd_mark() from the tagger). The software takes this
> as a cue that it should forward them where the hardware didn't reach.
>
> Also, never exclude the CPU port from the destination port mask, unless
> you really, really know what you're doing. The software bridge might
> need to forward to another foreign (non-switch) bridge port which is an
> Intel e1000 card, or a Wi-Fi AP, or a tunnel, and by cutting out the CPU
> from the flood path, you're taking that possibility away from it.
>
> Here's a script to get you started with testing.
>
> #!/bin/bash
>
> ARP=" \
> ff:ff:ff:ff:ff:ff 00:00:de:ad:be:ef 08 06 00 01 \
> 08 00 06 04 00 01 e0 07 1b 81 13 40 c0 a8 01 ad \
> 00 00 00 00 00 00 c0 a8 01 ea"
> groups=( \
>         01:80:C2:00:00:00 \
>         01:80:C2:00:00:08 \
>         01:80:C2:00:00:01 \
>         01:80:C2:00:00:03 \
>         01:80:C2:00:00:20 \
>         01:80:C2:00:00:21 \
>         01:80:C2:00:00:02 \
>         01:80:C2:00:00:04 \
>         01:80:C2:00:00:0F \
>         01:80:C2:00:00:11 \
>         01:80:C2:00:00:1F \
>         01:80:C2:00:00:22 \
>         01:80:C2:00:00:2F \
> )
> pkt_count=1000
>
> mac_get()
> {
>         local if_name=$1
>
>         ip -j link show dev $if_name | jq -r '.[]["address"]'
> }
>
> get_rx_stats()
> {
>         local if_name=$1
>
>         ip -j -s link show $if_name | jq '.[].stats64.rx.packets'
> }
>
> last_nibble()
> {
>         local macaddr=$1
>
>         echo "0x${macaddr:0-1}"
> }
>
> send_raw()
> {
>         local if_name=$1; shift
>         local group=$1; shift
>         local pkt="$1"; shift
>         local smac=$(mac_get $if_name)
>
>         pkt="${pkt/ff:ff:ff:ff:ff:ff/$group}"
>         pkt="${pkt/00:00:de:ad:be:ef/$smac}"
>
>         mausezahn -c $pkt_count -q $if_name "$pkt"
> }
>
> run_test()
> {
>         before=$(get_rx_stats veth4)
>         send_raw veth0 $group "$ARP"
>         after=$(get_rx_stats veth4)
>         delta=$((after - before))
>
>         [ $delta -ge $pkt_count ] && echo "forwarded" || echo "not forwarded"
> }
>
> #          br0
> #        /  |  \
> #       /   |   \
> #      /    |    \
> #     /     |     \
> #  veth1  veth3  veth5
> #    |      |      |
> #  veth0  veth2  veth4
> ip link add veth0 type veth peer name veth1
> ip link add veth2 type veth peer name veth3
> ip link add veth4 type veth peer name veth5
> ip link add br0 type bridge && ip link set br0 up
> ip link set veth1 master br0 && ip link set veth1 up
> ip link set veth3 master br0 && ip link set veth3 up
> ip link set veth5 master br0 && ip link set veth5 up
> ip link set veth0 up && ip link set veth2 up && ip link set veth4 up
>
> for group in "${groups[@]}"; do
>         ip link set veth5 type bridge_slave mcast_flood on
>         with_flooding=$(run_test $group)
>
>         ip link set veth5 type bridge_slave mcast_flood off
>         without_flooding=$(run_test $group)
>
>         bridge mdb add dev br0 port veth5 grp $group permanent
>         with_mdb_and_no_flooding=$(run_test $group)
>         bridge mdb del dev br0 port veth5 grp $group permanent # restore
>
>         ip link set veth5 type bridge_slave mcast_flood on # restore
>
>         bridge mdb add dev br0 port veth3 grp $group permanent
>         with_mdb_on_another_port=$(run_test $group)
>         bridge mdb del dev br0 port veth3 grp $group permanent # restore
>
>         ip link set br0 type bridge group_fwd_mask $((1 << $(last_nibble $group))) 2>/dev/null
>         if [ $? = 0 ]; then
>                 with_group_fwd_mask=$(run_test $group)
>                 ip link set br0 type bridge group_fwd_mask 0 # restore
>         else
>                 with_group_fwd_mask="can't test"
>         fi
>
>         printf "Group %s: %s with flooding, %s without flooding, %s with mdb and no flooding, %s with mdb on another port and flooding, %s with group_fwd_mask\n" \
>                 "$group" \
>                 "$with_flooding" \
>                 "$without_flooding" \
>                 "$with_mdb_and_no_flooding" \
>                 "$with_mdb_on_another_port" \
>                 "$with_group_fwd_mask" \
>
> done
>
> ip link del veth0
> ip link del veth2
> ip link del veth4
> ip link del br0

Hi Vladimir,

Here is the output of your script with Linux 6.13:
Group 01:80:C2:00:00:00: forwarded with flooding, not forwarded
without flooding, forwarded with mdb and no flooding, not forwarded
with mdb on another port and flooding, can't test with group_fwd_mask
Group 01:80:C2:00:00:08: not forwarded with flooding, not forwarded
without flooding, not forwarded with mdb and no flooding, not
forwarded with mdb on another port and flooding, forwarded with
group_fwd_mask
Group 01:80:C2:00:00:01: not forwarded with flooding, not forwarded
without flooding, not forwarded with mdb and no flooding, not
forwarded with mdb on another port and flooding, can't test with
group_fwd_mask
Group 01:80:C2:00:00:03: not forwarded with flooding, not forwarded
without flooding, not forwarded with mdb and no flooding, not
forwarded with mdb on another port and flooding, forwarded with
group_fwd_mask
Group 01:80:C2:00:00:20: forwarded with flooding, not forwarded
without flooding, forwarded with mdb and no flooding, not forwarded
with mdb on another port and flooding, can't test with group_fwd_mask
Group 01:80:C2:00:00:21: forwarded with flooding, not forwarded
without flooding, forwarded with mdb and no flooding, not forwarded
with mdb on another port and flooding, can't test with group_fwd_mask
Group 01:80:C2:00:00:02: not forwarded with flooding, not forwarded
without flooding, not forwarded with mdb and no flooding, not
forwarded with mdb on another port and flooding, can't test with
group_fwd_mask
Group 01:80:C2:00:00:04: not forwarded with flooding, not forwarded
without flooding, not forwarded with mdb and no flooding, not
forwarded with mdb on another port and flooding, forwarded with
group_fwd_mask
Group 01:80:C2:00:00:0F: not forwarded with flooding, not forwarded
without flooding, not forwarded with mdb and no flooding, not
forwarded with mdb on another port and flooding, forwarded with
group_fwd_mask
Group 01:80:C2:00:00:11: forwarded with flooding, not forwarded
without flooding, forwarded with mdb and no flooding, not forwarded
with mdb on another port and flooding, can't test with group_fwd_mask
Group 01:80:C2:00:00:1F: forwarded with flooding, not forwarded
without flooding, forwarded with mdb and no flooding, not forwarded
with mdb on another port and flooding, forwarded with group_fwd_mask
Group 01:80:C2:00:00:22: forwarded with flooding, not forwarded
without flooding, forwarded with mdb and no flooding, not forwarded
with mdb on another port and flooding, can't test with group_fwd_mask
Group 01:80:C2:00:00:2F: forwarded with flooding, not forwarded
without flooding, forwarded with mdb and no flooding, not forwarded
with mdb on another port and flooding, forwarded with group_fwd_mask

Why did you choose these addresses?

The original complaint I'm trying to address was that LLDP used to be
forwarded on the ksz9477 prior to the enabling of the hw multicast
address table and now is not. LLDP uses both 01-80-c2-00-00-00 and
01-80-c2-00-00-0e and while 01-80-c2-00-00-00 is forwarded currently
on the ksz9477 01-80-c2-00-00-0e is not. It's the same for the
software bridge scenario above - when I add 01-80-c2-00-00-0e to the
test, it's not forwarded. Where are the above rules implemented for
the software bridge and why are these the choices?

Best Regards,

Tim

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ