linux-kernel - Re: [net-next PATCH RFC] net: dsa: qca8k: make learning configurable and keep off if standalone

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <64a1caff.050a0220.561f4.7316@mx.google.com>
Date:   Sat, 1 Jul 2023 20:25:11 +0200
From:   Christian Marangi <ansuelsmth@...il.com>
To:     Vladimir Oltean <olteanv@...il.com>
Cc:     Andrew Lunn <andrew@...n.ch>,
        Florian Fainelli <f.fainelli@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>, Atin Bainada <hi@...nb.me>,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [net-next PATCH RFC] net: dsa: qca8k: make learning configurable
 and keep off if standalone

On Thu, Jun 29, 2023 at 03:39:54PM +0300, Vladimir Oltean wrote:
> On Wed, Jun 28, 2023 at 02:49:53AM +0200, Christian Marangi wrote:
> > Hi, I wasted a day to only notice that the problem was in busybox not
> > supporting 0.x value and that is what selfttest script use. Another
> > thing to check. Also the confusing part of this is that we don't check
> > if ping_do return error and the test just fails (while in reality the
> > ping command was never executed)
> 
> Interesting. Not sure how you'll check for the specific busybox
> implementation. Does it help if you add "check_err $?" after ping_do()
> in send_uc_ipv4(), like ping_test() has?
>

I think it's better to implement some simple tests before the test is
done. Similar to tools missing tests...

So something like "ping -i 0.1" and reading the output if it does match
a specific error. Or at least add some comments that that option is
required. But I guess check_err should be added anyway since it'd the
current pattern used in lib.sh

> > Anyway I'm fixing all kind of bugs and I even found an even hw
> > limitation with the FDB table with mgmt packet...
> > 
> > Also I implemented fdb_insolation following the implementation done on
> > felix with reserving VID at the end...
> > About this I wonder if it might be a good idea to expose some generic
> > API from DSA?
> > 
> > qca8k require a reserved VID for VLAN unaware port and with
> > fdb_isolation even more VID are reserved. DSA have no idea about this so
> > I wonder if there is a chance of VID clash? I feel like we need
> > something to declare a range of reserved VID so that they gets rejected
> > when applied. (about this I think I should return -EINVAL if fdb/mbd
> > insert are asked with a reserved VID)
> 
> I don't think that it's possible to get a port_fdb_add() or port_mdb_add()
> call for a VID that wasn't accepted through port_vlan_add() first.
> At least I don't see how. The VLAN-aware FDB entries come from the
> bridge through SWITCHDEV_OBJ_ID_PORT_VLAN and from DSA's private
> .ndo_vlan_rx_add_vid() / .ndo_vlan_rx_kill_vid() implementations for
> VLAN filters (those that land in dp->user_vlans; refresh your net.git
> tree if you don't find those).
> 

Ok so in theory it should be reserved and a check should be added in
port_valn_add to return an error if a reserved VID is used.

> If we reject VLANs at those layers, then:
> # through the bridge
> [root@mox:~] # bridge fdb add 00:01:02:03:04:05 vlan 2000 dev lan21 master static
> [ 7611.225227] bridge: RTM_NEWNEIGH with unconfigured vlan 2000 on lan21
> RTNETLINK answers: Invalid argument
> # bridge bypass, should go through ndo_fdb_add() if DSA had one
> [root@mox:~] # bridge fdb add 00:01:02:03:04:05 vlan 2000 dev lan21 self static
> [ 7855.532615] mv88e6085 d0032004.mdio-mii:12 lan21: default FDB implementation only supports local addresses
> RTNETLINK answers: Invalid argument
> 
> So, while we could probably add some API in core DSA for a reserved VID
> range, I'm not sure that it will be that useful.
> 
> And regarding whether there is a chance of VID class? I guess there is.
> I have a board where the felix driver (reserved range 4000-4095) is a
> DSA master for the sja1105 driver (reserved range 3072-4095 for tag_8021q).
> Since any dsa_tag_8021q_port_setup() call does a vlan_vid_add() call
> towards the master, then there's a chance that felix could reject the
> tag_8021q setup of the sja1105 ports for tag_8021q VIDs 4000 and upwards.
> 
> VID 4000 = 0xfa0 = bitwise 0b1111_1010_0000
> 
> Comparing with net/dsa/tag_8021q.c:
> 
>  * | 11  | 10  |  9  |  8  |  7  |  6  |  5  |  4  |  3  |  2  |  1  |  0  |
>  * +-----------+-----+-----------------+-----------+-----------------------+
>  * |    RSV    | VBID|    SWITCH_ID    |   VBID    |          PORT         |
>  * +-----------+-----+-----------------+-----------+-----------------------+
> 
> So it would be problematic for VBID >= 6, whenever SWITCH_ID >= 6.
> However, practical applications of tag_8021q with the sja1105 boards
> that I'm aware of do not have 7 switches, so it's a theoretical problem
> only.
> 
> Though we would need to be rather careful when calculating and enforcing
> the reserved ranges in the DSA core, to not cause false positive errors.
> 

Was just an idea since I see only felix using fdb_isolation and it seems
reserving VID is the current way to handle it.

> > 
> > Anyway by fixing that interval problem (enabling support in busybox as
> > it's disabled by default in a OpenWRT system)
> > This is the new output of the local_termination.sh 
> > 
> > TAP version 13
> > 1..1
> > # selftests: drivers/net/dsa: local_termination.sh
> > # TEST: lan1: Unicast IPv4 to primary MAC address                     [ OK ]
> > # TEST: lan1: Unicast IPv4 to macvlan MAC address                     [ OK ]
> > # TEST: lan1: Unicast IPv4 to unknown MAC address                     [FAIL]
> > #       reception succeeded, but should have failed
> > # TEST: lan1: Unicast IPv4 to unknown MAC address, promisc            [ OK ]
> > # TEST: lan1: Unicast IPv4 to unknown MAC address, allmulti           [FAIL]
> > #       reception succeeded, but should have failed
> > # TEST: lan1: Multicast IPv4 to joined group                          [ OK ]
> > # TEST: lan1: Multicast IPv4 to unknown group                         [FAIL]
> > #       reception succeeded, but should have failed
> > # TEST: lan1: Multicast IPv4 to unknown group, promisc                [ OK ]
> > # TEST: lan1: Multicast IPv4 to unknown group, allmulti               [ OK ]
> > # TEST: lan1: Multicast IPv6 to joined group                          [ OK ]
> > # TEST: lan1: Multicast IPv6 to unknown group                         [FAIL]
> > #       reception succeeded, but should have failed
> > # TEST: lan1: Multicast IPv6 to unknown group, promisc                [ OK ]
> > # TEST: lan1: Multicast IPv6 to unknown group, allmulti               [ OK ]
> > # TEST: br0: Unicast IPv4 to primary MAC address                      [ OK ]
> > # TEST: br0: Unicast IPv4 to macvlan MAC address                      [ OK ]
> > # TEST: br0: Unicast IPv4 to unknown MAC address                      [FAIL]
> > #       reception succeeded, but should have failed
> > # TEST: br0: Unicast IPv4 to unknown MAC address, promisc             [ OK ]
> > # TEST: br0: Unicast IPv4 to unknown MAC address, allmulti            [FAIL]
> > #       reception succeeded, but should have failed
> > # TEST: br0: Multicast IPv4 to joined group                           [ OK ]
> > # TEST: br0: Multicast IPv4 to unknown group                          [FAIL]
> > #       reception succeeded, but should have failed
> > # TEST: br0: Multicast IPv4 to unknown group, promisc                 [ OK ]
> > # TEST: br0: Multicast IPv4 to unknown group, allmulti                [ OK ]
> > # TEST: br0: Multicast IPv6 to joined group                           [ OK ]
> > # TEST: br0: Multicast IPv6 to unknown group                          [FAIL]
> > #       reception succeeded, but should have failed
> > # TEST: br0: Multicast IPv6 to unknown group, promisc                 [ OK ]
> > # TEST: br0: Multicast IPv6 to unknown group, allmulti                [ OK ]
> > not ok 1 selftests: drivers/net/dsa: local_termination.sh # exit=1
> > 
> > Seems good aside from the reception secceeded that I think the kernel
> > just drops right?
> 
> Looks good. If you've implemented FDB isolation, then it means that you
> get port_fdb_add() and port_mdb_add() calls on the CPU port(s), and you
> can now also implement .port_set_host_flood() and that should also make
> the following tests pass:
> 
> * TEST: lan1: Unicast IPv4 to unknown MAC address                     [FAIL]
> * TEST: lan1: Unicast IPv4 to unknown MAC address, allmulti           [FAIL]
> * TEST: lan1: Multicast IPv4 to unknown group                         [FAIL]
> * TEST: lan1: Multicast IPv6 to unknown group                         [FAIL]
> 
> The bridge tests to an unknown address will always fail with the message
> "reception succeeded, but should have failed", so it's not you, it's the
> test there.
> 
> Once you're there, you can re-do these tests with:
> - all user ports to CPU port 0
> - all user ports to CPU port 6
> - user ports statically split between CPU ports 0 and 6
> - all user ports to LAG composed of CPU ports 0 and 6
> 

I have bad and good news about this... Bad news is that it seems to be
qca8k handle flooding in a very interesting and broken way.
Due to DSA implementation we expect every packet to be flooded to the
CPU port by default and this is problematic with how the switch works.

It seems that enabling flood bit for the CPU results in packet always be
directed there... The switch tagger have a way to comunicate that the
packet is flooded to the cpu.

With some code in the tagger I manage to reach this state from
local_termination.sh

root@...nWrt:~# /ktests/run_kselftest.sh -t drivers/net/dsa:local_termination.sh
TAP version 13
1..1
# selftests: drivers/net/dsa: local_termination.sh
# TEST: lan1: Unicast IPv4 to primary MAC address                     [ OK ]
# TEST: lan1: Unicast IPv4 to macvlan MAC address                     [ OK ]
# TEST: lan1: Unicast IPv4 to unknown MAC address                     [ OK ]
# TEST: lan1: Unicast IPv4 to unknown MAC address, promisc            [ OK ]
# TEST: lan1: Unicast IPv4 to unknown MAC address, allmulti           [ OK ]
# TEST: lan1: Multicast IPv4 to joined group                          [ OK ]
# TEST: lan1: Multicast IPv4 to unknown group                         [ OK ]
# TEST: lan1: Multicast IPv4 to unknown group, promisc                [ OK ]
# TEST: lan1: Multicast IPv4 to unknown group, allmulti               [ OK ]
# TEST: lan1: Multicast IPv6 to joined group                          [ OK ]
# TEST: lan1: Multicast IPv6 to unknown group                         [ OK ]
# TEST: lan1: Multicast IPv6 to unknown group, promisc                [ OK ]
# TEST: lan1: Multicast IPv6 to unknown group, allmulti               [ OK ]
# TEST: br0: Unicast IPv4 to primary MAC address                      [ OK ]
# TEST: br0: Unicast IPv4 to macvlan MAC address                      [ OK ]
# TEST: br0: Unicast IPv4 to unknown MAC address                      [FAIL]
#       reception succeeded, but should have failed
# TEST: br0: Unicast IPv4 to unknown MAC address, promisc             [ OK ]
# TEST: br0: Unicast IPv4 to unknown MAC address, allmulti            [FAIL]
#       reception succeeded, but should have failed
# TEST: br0: Multicast IPv4 to joined group                           [ OK ]
# TEST: br0: Multicast IPv4 to unknown group                          [FAIL]
#       reception succeeded, but should have failed
# TEST: br0: Multicast IPv4 to unknown group, promisc                 [ OK ]
# TEST: br0: Multicast IPv4 to unknown group, allmulti                [ OK ]
# TEST: br0: Multicast IPv6 to joined group                           [ OK ]
# TEST: br0: Multicast IPv6 to unknown group                          [FAIL]
#       reception succeeded, but should have failed
# TEST: br0: Multicast IPv6 to unknown group, promisc                 [ OK ]
# TEST: br0: Multicast IPv6 to unknown group, allmulti                [ OK ]
not ok 1 selftests: drivers/net/dsa: local_termination.sh # exit=1

The tagger check if the packet type is a flood kind and drop the packet
(return NULL) if the interface is not promisc or allmulti. If the port
if bridge type this check is skipped...

My concern is... Is it ok to implement host_set_flood this way? I tried
many variant and I wasn't able to make flood work by using those bits...

With the current implementation, host_set_flood will return always zero
and everything will be handled in the tagger by dropping packet but I'm
not sure it's ok to do it.

> > The switch have a way to control FLOOD per port but maybe the correct
> > feature to use is the VLAN leaky? Where the setting can be set by both
> > FDB/MBD and per port.
> 
> I'm not sure why would the leaky VLAN feature be useful ("enable
> specific frames to be forwarded across VLAN boundary"). I really don't
> know what you're thinking about.
> 

Yes sorry, just me wasting way too much time on this and searching for
alternative solution... Switch documentation doesn't describe these case
very well sadly...

> > Sorry if I described some fix and implementation without proposing the
> > patch but I would like some comments on what the tool returned.
> > 

-- 
	Ansuel