Message-ID: <fd16ebb3-2435-ef01-d9f1-b873c9c0b389@gmail.com>
Date: Mon, 25 Jul 2022 14:35:40 -0700
From: Florian Fainelli <f.fainelli@...il.com>
To: Brian Hutchinson <b.hutchman@...il.com>, netdev@...r.kernel.org
Cc: andrew@...n.ch, Vladimir Oltean <olteanv@...il.com>,
woojung.huh@...rochip.com, UNGLinuxDriver@...rochip.com,
j.vosburgh@...il.com, vfalico@...il.com, andy@...yhouse.net,
davem@...emloft.net, kuba@...nel.org
Subject: Re: Bonded multicast traffic causing packet loss when using DSA with
Microchip KSZ9567 switch
On 7/25/22 08:12, Brian Hutchinson wrote:
> I'm experiencing large packet loss when using multicast with bonded
> DSA interfaces.
>
> I have the first two ports of a ksz9567 set up as individual network
> interfaces in the device tree, which show up in the system as lan1 and
> lan2, and I have those two interfaces bonded in an "active-backup"
> bond with the intent of having each slave interface go to a redundant
> switch. I've tried connecting both interfaces to the same switch and
> also to separate switches that are then connected together. In the
> latter setup, if I disconnect the two switches I don't see the
> problem.
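>
> For reference, here's roughly how the bond is brought up (a minimal
> sketch; the miimon value is an assumption, the addressing matches the
> ifconfig output below):
>
>   # create an active-backup bond and enslave the two switch ports
>   ip link add bond1 type bond mode active-backup miimon 100
>   ip link set lan1 down && ip link set lan1 master bond1
>   ip link set lan2 down && ip link set lan2 master bond1
>   ip link set bond1 up
>   ip addr add 192.168.1.6/24 dev bond1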
>
> The kernel bonding documentation says "active-backup" works with any
> layer-2 switch and doesn't require smart/managed switches configured
> in any particular way. I'm currently using dumb (unmanaged) switches.
>
> I can readily reproduce the packet loss issue by running iperf to
> generate multicast traffic.
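>
> Roughly like this (the group address and bandwidth here are
> placeholders, not necessarily my exact values):
>
>   # receiver: join/listen on a multicast group over UDP
>   iperf -s -u -B 239.255.1.1 -i 1
>   # sender: blast UDP traffic at the same group
>   iperf -c 239.255.1.1 -u -b 100M -t 600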
>
> If I ping my board with the ksz9567 from a PC while iperf is
> generating multicast packets, I get tons of packet loss. If I run
> heavily loaded iperf tests that are not multicast, I don't notice the
> packet loss problem.
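>
> For example, from the PC (pinging the bond1 address shown below):
>
>   # heavy loss while the multicast iperf test is running
>   ping -c 100 192.168.1.6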
>
> Here is the ifconfig view of the interfaces:
>
> bond1: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500 metric 1
> inet 192.168.1.6 netmask 255.255.255.0 broadcast 0.0.0.0
> inet6 fd1c:a799:6054:0:60e2:5ff:fe75:6716 prefixlen 64
> scopeid 0x0<global>
> inet6 fe80::60e2:5ff:fe75:6716 prefixlen 64 scopeid 0x20<link>
> ether 62:e2:05:75:67:16 txqueuelen 1000 (Ethernet)
> RX packets 1264782 bytes 84198600 (80.2 MiB)
> RX errors 0 dropped 40 overruns 0 frame 0
> TX packets 2466062 bytes 3705565532 (3.4 GiB)
> TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
>
> eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1506 metric 1
> inet6 fe80::f21f:afff:fe6b:b218 prefixlen 64 scopeid 0x20<link>
> ether f0:1f:af:6b:b2:18 txqueuelen 1000 (Ethernet)
> RX packets 1264782 bytes 110759022 (105.6 MiB)
> RX errors 0 dropped 0 overruns 0 frame 0
> TX packets 2466097 bytes 3710503019 (3.4 GiB)
> TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
>
> lan1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500 metric 1
> ether 62:e2:05:75:67:16 txqueuelen 1000 (Ethernet)
> RX packets 543771 bytes 37195218 (35.4 MiB)
> RX errors 0 dropped 20 overruns 0 frame 0
> TX packets 1058336 bytes 1593030865 (1.4 GiB)
> TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
>
> lan2: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500 metric 1
> ether 62:e2:05:75:67:16 txqueuelen 1000 (Ethernet)
> RX packets 721011 bytes 47003382 (44.8 MiB)
> RX errors 0 dropped 0 overruns 0 frame 0
> TX packets 1407726 bytes 2112534667 (1.9 GiB)
> TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
>
> lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536 metric 1
> inet 127.0.0.1 netmask 255.0.0.0
> inet6 ::1 prefixlen 128 scopeid 0x10<host>
> loop txqueuelen 1000 (Local Loopback)
> RX packets 394 bytes 52052 (50.8 KiB)
> RX errors 0 dropped 0 overruns 0 frame 0
> TX packets 394 bytes 52052 (50.8 KiB)
> TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
>
> Is what I'm trying to do even valid with dumb switches, or is the
> bonding documentation wrong/outdated about active-backup bonds not
> needing smart switches?
>
> I know there's probably no one out there who can reproduce my setup to
> look at this problem, but I'm willing to run whatever tests and
> provide all the info/feedback I can.
>
> I'm running 5.10.69 on an iMX8MM with a custom Linux OS based on the
> Yocto Dunfell release.
>
> I know that the DSA master interface eth0 is not supposed to be
> accessed directly, yet I see eth0 getting an IPv6 address, and I'm
> wondering if that could cause the networking stack to attempt to use
> eth0 directly for traffic.

This is a red herring: without a lot of special casing, we cannot tell
the network stack that the DSA master network device must only
transport tagged traffic to/from the switch, so the IPv6 stack still
happily generates a link-local address for your adapter.
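
If the address itself bothers you, it can be suppressed with the
standard sysctls (just a sketch, not required for anything DSA does):

  # stop IPv6 link-local address generation on the DSA master
  sysctl -w net.ipv6.conf.eth0.addr_gen_mode=1
  # or turn IPv6 off on it entirely
  sysctl -w net.ipv6.conf.eth0.disable_ipv6=1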

Any chance of getting the output of ethtool -S for lan1, lan2, and
eth0, so we could possibly glean something from the hardware-maintained
statistics?
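
That is:

  # per-port counters maintained by the switch/MAC hardware
  ethtool -S lan1
  ethtool -S lan2
  ethtool -S eth0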
--
Florian