netdev - Re: [net-next PATCH RFC] net: dsa: qca8k: make learning configurable and keep off if standalone

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Wed, 5 Jul 2023 02:34:56 +0300
From: Vladimir Oltean <olteanv@...il.com>
To: Christian Marangi <ansuelsmth@...il.com>
Cc: Andrew Lunn <andrew@...n.ch>, Florian Fainelli <f.fainelli@...il.com>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Atin Bainada <hi@...nb.me>, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [net-next PATCH RFC] net: dsa: qca8k: make learning configurable
 and keep off if standalone

On Sat, Jul 01, 2023 at 08:25:11PM +0200, Christian Marangi wrote:
> I have bad and good news about this... Bad news is that it seems to be
> qca8k handle flooding in a very interesting and broken way.
> Due to DSA implementation we expect every packet to be flooded to the
> CPU port by default and this is problematic with how the switch works.

To be clear, the network stack understands "flooding" to be something
that takes place for packets that have an unknown MAC DA, or broadcast.
So no, we don't expect that every packet is flooded, nor that all are
flooded to the CPU port. I'm not exactly clear what you mean.

> 
> It seems that enabling flood bit for the CPU results in packet always be
> directed there...

Are you sure? How have you determined this? Can you rephrase this?
Because what I'm understanding is that you're saying that when flooding
towards the CPU is enabled, packets go to the CPU regardless of whether
they're known to the FDB or not. Which can't be correct, because if it
was, then packets sent to a MAC DA statically added as a bridge FDB entry
would be seen twice:

- once, autonomously forwarded by the switch
- twice, due to your claim that "packets are always directed to the CPU",
  (claim which must also hold with the current driver configuration,
  because the driver currently always enables flooding to the CPU)

But if that's the case, it means that the CPU (via the software bridge)
will send a second copy of the packet to the egress port, because
skb->offload_fwd_mark is unset. Someone would have noticed if that was
the case ?!

I have some skepticism towards this claim.

Please try to describe what happens under this scenario:

       br0
     /  |  \
    /   |   \
   /    |    \
 swp0  swp1  swp2
  |     |     |
  A     B     C

# On the switch
ip link add br0 type bridge vlan_filtering 1 && ip link set br0 up
ip link set swp0 master br0 && ip link set br0 up
ip link set swp1 master br0 && ip link set br0 up
ip link set swp2 master br0 && ip link set br0 up
bridge fdb add dev swp1 ${mac_of_B} vlan 1 master static

# On C
tcpdump -i eth0 -e -n

# On A
ping $B

How many ICMP requests do you see on B? How many do you see on C?
How many on the CPU port (br0)?

Correct answers should be:
On B: as many as were sent by A
On C: none
On br0: none on ingress (tcpdump -Q in), none on egress (tcpdump -Q out)

If you're seeing the ICMP requests on station C, see whether they don't
in fact come from br0 having flooded them in software, which means that
the real problem is with the FDB not matching on packets..

I have no idea how you've implemented things, but with reserved VIDs,
you also need to remap VID 0 during port_fdb_add() calls to the reserved
PVID of that bridge, and you've done that, otherwise packets won't match,
right?

> The switch tagger have a way to comunicate that the packet is flooded
> to the cpu.
> 
> With some code in the tagger I manage to reach this state from
> local_termination.sh
> 
> root@...nWrt:~# /ktests/run_kselftest.sh -t drivers/net/dsa:local_termination.sh
> TAP version 13
> 1..1
> # selftests: drivers/net/dsa: local_termination.sh
> # TEST: lan1: Unicast IPv4 to primary MAC address                     [ OK ]
> # TEST: lan1: Unicast IPv4 to macvlan MAC address                     [ OK ]
> # TEST: lan1: Unicast IPv4 to unknown MAC address                     [ OK ]
> # TEST: lan1: Unicast IPv4 to unknown MAC address, promisc            [ OK ]
> # TEST: lan1: Unicast IPv4 to unknown MAC address, allmulti           [ OK ]
> # TEST: lan1: Multicast IPv4 to joined group                          [ OK ]
> # TEST: lan1: Multicast IPv4 to unknown group                         [ OK ]
> # TEST: lan1: Multicast IPv4 to unknown group, promisc                [ OK ]
> # TEST: lan1: Multicast IPv4 to unknown group, allmulti               [ OK ]
> # TEST: lan1: Multicast IPv6 to joined group                          [ OK ]
> # TEST: lan1: Multicast IPv6 to unknown group                         [ OK ]
> # TEST: lan1: Multicast IPv6 to unknown group, promisc                [ OK ]
> # TEST: lan1: Multicast IPv6 to unknown group, allmulti               [ OK ]
> # TEST: br0: Unicast IPv4 to primary MAC address                      [ OK ]
> # TEST: br0: Unicast IPv4 to macvlan MAC address                      [ OK ]
> # TEST: br0: Unicast IPv4 to unknown MAC address                      [FAIL]
> #       reception succeeded, but should have failed
> # TEST: br0: Unicast IPv4 to unknown MAC address, promisc             [ OK ]
> # TEST: br0: Unicast IPv4 to unknown MAC address, allmulti            [FAIL]
> #       reception succeeded, but should have failed
> # TEST: br0: Multicast IPv4 to joined group                           [ OK ]
> # TEST: br0: Multicast IPv4 to unknown group                          [FAIL]
> #       reception succeeded, but should have failed
> # TEST: br0: Multicast IPv4 to unknown group, promisc                 [ OK ]
> # TEST: br0: Multicast IPv4 to unknown group, allmulti                [ OK ]
> # TEST: br0: Multicast IPv6 to joined group                           [ OK ]
> # TEST: br0: Multicast IPv6 to unknown group                          [FAIL]
> #       reception succeeded, but should have failed
> # TEST: br0: Multicast IPv6 to unknown group, promisc                 [ OK ]
> # TEST: br0: Multicast IPv6 to unknown group, allmulti                [ OK ]
> not ok 1 selftests: drivers/net/dsa: local_termination.sh # exit=1
> 
> The tagger check if the packet type is a flood kind and drop the packet
> (return NULL) if the interface is not promisc or allmulti. If the port
> if bridge type this check is skipped...

Why is the check skipped if the port is under a bridge? Because there
might be foreign interfaces under that bridge (like wlan0) which might
be valid destinations for flooded traffic (through software forwarding)?

> My concern is... Is it ok to implement host_set_flood this way? I tried
> many variant and I wasn't able to make flood work by using those bits...
> 
> With the current implementation, host_set_flood will return always zero
> and everything will be handled in the tagger by dropping packet but I'm
> not sure it's ok to do it.

Well, we should first clarify exactly what it is that you claim to be a
hardware bug.