Message-ID: <CAFZh4h-3AaoQwJcaQoYc_e=yrR7a6d7Qr77R8o56mtbFye_0cw@mail.gmail.com>
Date: Thu, 28 Jul 2022 15:14:17 -0400
From: Brian Hutchinson <b.hutchman@...il.com>
To: Vladimir Oltean <olteanv@...il.com>
Cc: Florian Fainelli <f.fainelli@...il.com>, netdev@...r.kernel.org,
andrew@...n.ch, woojung.huh@...rochip.com,
UNGLinuxDriver@...rochip.com, j.vosburgh@...il.com,
vfalico@...il.com, andy@...yhouse.net, davem@...emloft.net,
kuba@...nel.org
Subject: Re: Bonded multicast traffic causing packet loss when using DSA with
Microchip KSZ9567 switch
Hello netdev,
On Thu, Jul 28, 2022 at 10:45 AM Brian Hutchinson <b.hutchman@...il.com> wrote:
>
> Hi Vladimir,
>
> On Wed, Jul 27, 2022 at 7:32 PM Vladimir Oltean <olteanv@...il.com> wrote:
>
> > > bond1: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500 metric 1
> > > inet 192.168.1.6 netmask 255.255.255.0 broadcast 0.0.0.0
> > > inet6 fd1c:a799:6054:0:60e2:5ff:fe75:6716 prefixlen 64 scopeid 0x0<global>
> > > inet6 fe80::60e2:5ff:fe75:6716 prefixlen 64 scopeid 0x20<link>
> > > ether 62:e2:05:75:67:16 txqueuelen 1000 (Ethernet)
> >
> > I see bond1, lan1 and lan2 all have the same MAC address (62:e2:05:75:67:16).
> > Does this happen even when they are all different?
>
> So I have (when the bond is set up using systemd) assigned unique MAC
> addresses to eth0, lan1 and lan2 ... but the default action of bonding
> is to assign the bond (bond1) and the slaves (lan1, lan2) the same MAC
> across all of those interfaces. There are settings (controlled by
> fail_over_mac) to specify which MAC is chosen to seed the MACs of the
> other interfaces, but the bottom line is that bonding makes at least
> the bond and the active slave share the same MAC.
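>
> For reference, a rough sketch of the behaviour I mean, in iproute2
> syntax (illustrative only, not exactly how my systemd units create it):
>
>   ip link add bond1 type bond mode active-backup miimon 1000
>   ip link set lan1 down && ip link set lan1 master bond1
>   ip link set lan2 down && ip link set lan2 master bond1
>
> With the default fail_over_mac=none, bond1 and both slaves end up with
> the same MAC; with fail_over_mac=active the slaves keep their own MACs
> and the bond's MAC just follows whichever slave is currently active.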
>
> >
> > > RX packets 2557 bytes 3317974 (3.1 MiB)
> > > RX errors 0 dropped 2 overruns 0 frame 0
> > > TX packets 2370 bytes 3338160 (3.1 MiB)
> > > TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
> > >
> > > eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1506 metric 1
> > > inet6 fe80::f21f:afff:fe6b:b218 prefixlen 64 scopeid 0x20<link>
> > > ether f0:1f:af:6b:b2:18 txqueuelen 1000 (Ethernet)
> > > RX packets 2557 bytes 3371671 (3.2 MiB)
> > > RX errors 0 dropped 0 overruns 0 frame 0
> > > TX packets 2394 bytes 3345891 (3.1 MiB)
> > > TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
> > >
> > > lan1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500 metric 1
> > > ether 62:e2:05:75:67:16 txqueuelen 1000 (Ethernet)
> > > RX packets 248 bytes 19384 (18.9 KiB)
> > > RX errors 0 dropped 0 overruns 0 frame 0
> > > TX packets 2370 bytes 3338160 (3.1 MiB)
> > > TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
> > >
> > > lan2: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500 metric 1
> > > ether 62:e2:05:75:67:16 txqueuelen 1000 (Ethernet)
> > > RX packets 2309 bytes 3298590 (3.1 MiB)
> > > RX errors 0 dropped 1 overruns 0 frame 0
> >
> > I find this extremely strange. AFAIK, ifconfig reads stats from /proc/net/dev,
> > which in turn takes them from the driver using dev_get_stats():
> > https://elixir.bootlin.com/linux/v5.10.69/source/net/core/net-procfs.c#L78
> >
> > But DSA didn't even report the "dropped" count via ndo_get_stats64 in 5.10...
> > https://elixir.bootlin.com/linux/v5.10.69/source/net/dsa/slave.c#L1257
> >
> > I have no idea why this shows 1. I'll have to ignore this information
> > for now.
> >
> .
> .
> .
> >
> > Would you mind trying to test the exact same scenario again with the
> > patch attached? (also pasted below in plain text) Still the same MAC
> > address for all interfaces for now.
>
> No problem at all. I'm stumped too and welcome ideas to figure out
> what is going on.
>
> >
> > From 033e3a8650a498de73cd202375b2e3f843e9a376 Mon Sep 17 00:00:00 2001
> > From: Vladimir Oltean <vladimir.oltean@....com>
> > Date: Thu, 28 Jul 2022 02:07:08 +0300
> > Subject: [PATCH] ksz9477: force-disable address learning
> >
> > I suspect that what Brian Hutchinson experiences with the rx_discards
> > counter incrementing is due to his setup, where 2 external switches
> > connect together 2 bonded KSZ9567 switch ports, in such a way that one
> > KSZ port is able to see packets sent by the other (this is probably
> > aggravated by the multicast sent at a high data rate, which is treated
> > as broadcast by the external switches and flooded).
>
> So I mentioned in a recent PM that I was looking at other vendor DSA
> drivers, and I see code that looks related to some of the concerns you have.
>
> I did some grepping in /drivers/net/dsa and, while I get hits for
> things like 'flood', 'multicast', 'igmp' etc. in the Marvell and
> Broadcom drivers ... I get nothing in microchip. The hardware
> documentation has a whole section on ingress and egress rate limiting
> and shaping, but it doesn't look like the driver uses any of it.
>
> Example:
>
> /drivers/net/dsa/mv88e6xxx$ grep -i multicast *.c
> chip.c: { "in_multicasts", 4, 0x07, STATS_TYPE_BANK0, },
> chip.c: { "out_multicasts", 4, 0x12, STATS_TYPE_BANK0, },
> chip.c: is_multicast_ether_addr(addr))
> chip.c: /* Upstream ports flood frames with unknown unicast or multicast DA */
> chip.c: * forwarding of unknown unicasts and multicasts.
> chip.c: dev_err(ds->dev, "p%d: failed to load multicast MAC address\n",
> chip.c: bool unicast, bool multicast)
> chip.c: multicast);
> global2.c: /* Consider the frames with reserved multicast destination
> global2.c: /* Consider the frames with reserved multicast destination
> port.c: bool unicast, bool multicast)
> port.c: if (unicast && multicast)
> port.c: else if (multicast)
> port.c: int port, bool multicast)
> port.c: if (multicast)
> port.c: bool unicast, bool multicast)
> port.c: return mv88e6185_port_set_default_forward(chip, port, multicast);
>
> Wondering if some needed support is missing.
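>
> (For comparison, I mean something like running the same grep in the
> Microchip directory, e.g.
>
>   /drivers/net/dsa/microchip$ grep -i multicast *.c
>
> and getting no hits, which is what I meant by getting nothing in
> microchip.)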
>
> Will try your patch and report back.
I applied Vladimir's patch (had to edit it to change ksz9477.c to
ksz9477_main.c) ;)
I did the same steps as before but ran the multicast iperf a bit longer,
as I wasn't noticing packet loss this time. I also fat-fingered the
options on the first iperf run, so if you focus on the number of
datagrams iperf sent below, the RX counts won't match it.

On the PC I ran:
  iperf -s -u -B 239.0.0.67%enp4s0 -i 1

On my board I ran:
  iperf -B 192.168.1.6 -c 239.0.0.67 -u --ttl 5 -t 3600 -b 1M -i 1

(I noticed I had a copy/paste error in the previous email ... no, I
didn't use a --ttl of 3000!!!) Again, I didn't let iperf run for the
full 3600 sec.; I ctrl-c'd it early.

Pings from the external PC to the board while the multicast iperf test
was running showed zero dropped packets.
.
.
.
64 bytes from 192.168.1.6: icmp_seq=98 ttl=64 time=1.94 ms
64 bytes from 192.168.1.6: icmp_seq=99 ttl=64 time=1.91 ms
64 bytes from 192.168.1.6: icmp_seq=100 ttl=64 time=0.713 ms
64 bytes from 192.168.1.6: icmp_seq=101 ttl=64 time=1.95 ms
64 bytes from 192.168.1.6: icmp_seq=102 ttl=64 time=1.26 ms
^C
--- 192.168.1.6 ping statistics ---
102 packets transmitted, 102 received, 0% packet loss, time 101265ms
rtt min/avg/max/mdev = 0.253/1.451/2.372/0.414 ms
... I also noticed that the board's ping times improved greatly. They
are usually over 2 ms and I'm not sure why, or what to do about it.
iperf on the board sent 9901 datagrams:
.
.
.
[ 3] 108.0-109.0 sec 128 KBytes 1.05 Mbits/sec
[ 3] 109.0-110.0 sec 129 KBytes 1.06 Mbits/sec
[ 3] 110.0-111.0 sec 128 KBytes 1.05 Mbits/sec
^C[ 3] 0.0-111.0 sec 13.9 MBytes 1.05 Mbits/sec
[ 3] Sent 9901 datagrams
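(Sanity check: 9901 datagrams at iperf's default UDP payload of 1470
bytes for IPv4 works out to roughly 14.55 million bytes, i.e. about
13.9 MiB, which matches the 13.9 MBytes iperf reports above. The
per-port counters below are higher than 9901 frames because, as noted,
they also include the fat-fingered first run.)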
ethtool statistics:
# ethtool -S eth0 | grep -v ': 0'
NIC statistics:
tx_packets: 32713
tx_broadcast: 2
tx_multicast: 32041
tx_65to127byte: 719
tx_128to255byte: 30
tx_1024to2047byte: 31964
tx_octets: 48598874
IEEE_tx_frame_ok: 32713
IEEE_tx_octets_ok: 48598874
rx_packets: 33260
rx_broadcast: 378
rx_multicast: 32209
rx_65to127byte: 1140
rx_128to255byte: 136
rx_256to511byte: 20
rx_1024to2047byte: 31964
rx_octets: 48624055
IEEE_rx_frame_ok: 33260
IEEE_rx_octets_ok: 48624055
p06_rx_bcast: 2
p06_rx_mcast: 32041
p06_rx_ucast: 670
p06_rx_65_127: 719
p06_rx_128_255: 30
p06_rx_1024_1522: 31964
p06_tx_bcast: 378
p06_tx_mcast: 32209
p06_tx_ucast: 673
p06_rx_total: 48598874
p06_tx_total: 48624055
# ethtool -S lan1 | grep -v ': 0'
NIC statistics:
tx_packets: 32711
tx_bytes: 48401459
rx_packets: 1011
rx_bytes: 84159
rx_bcast: 207
rx_mcast: 111
rx_ucast: 697
rx_64_or_less: 234
rx_65_127: 699
rx_128_255: 70
rx_256_511: 12
tx_bcast: 2
tx_mcast: 32015
tx_ucast: 694
rx_total: 103241
tx_total: 48532849
rx_discards: 4
# ethtool -S lan2 | grep -v ': 0'
NIC statistics:
rx_packets: 32325
rx_bytes: 47915110
rx_bcast: 209
rx_mcast: 32120
rx_64_or_less: 212
rx_65_127: 55
rx_128_255: 86
rx_256_511: 12
rx_1024_1522: 31964
rx_total: 48497844
rx_discards: 4
ifconfig stats: (2 dropped packets on lan2. Last time lan1 and lan2 had
roughly the same RX counts; this time lan1 has significantly fewer.)
# ifconfig
bond1: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500 metric 1
inet 192.168.1.6 netmask 255.255.255.0 broadcast 0.0.0.0
inet6 fd1c:a799:6054:0:60e2:5ff:fe75:6716 prefixlen 64
scopeid 0x0<global>
inet6 fe80::60e2:5ff:fe75:6716 prefixlen 64 scopeid 0x20<link>
ether 62:e2:05:75:67:16 txqueuelen 1000 (Ethernet)
RX packets 33392 bytes 48003505 (45.7 MiB)
RX errors 0 dropped 4 overruns 0 frame 0
TX packets 32723 bytes 48402583 (46.1 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1506 metric 1
inet6 fe80::f21f:afff:fe6b:b218 prefixlen 64 scopeid 0x20<link>
ether f0:1f:af:6b:b2:18 txqueuelen 1000 (Ethernet)
RX packets 33392 bytes 48704737 (46.4 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 32749 bytes 48471466 (46.2 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lan1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500 metric 1
ether 62:e2:05:75:67:16 txqueuelen 1000 (Ethernet)
RX packets 1045 bytes 86755 (84.7 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 32723 bytes 48402583 (46.1 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lan2: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500 metric 1
ether 62:e2:05:75:67:16 txqueuelen 1000 (Ethernet)
RX packets 32347 bytes 47916750 (45.6 MiB)
RX errors 0 dropped 2 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536 metric 1
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
# cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v5.10.69-g472c99a84cb6-dirty
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: lan1
MII Status: up
MII Polling Interval (ms): 1000
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0
Slave Interface: lan1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: f0:1f:af:6b:b2:18
Slave queue ID: 0
Slave Interface: lan2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: f0:1f:af:6b:b2:18
Slave queue ID: 0
*Note: I did unplug the lan2 interface before I ran the test, which is
why lan2's Link Failure Count is 1.
Regards,
Brian