Message-ID: <20220728222927.rc7yfkwinqoiec3w@skbuf>
Date:   Fri, 29 Jul 2022 01:29:27 +0300
From:   Vladimir Oltean <olteanv@...il.com>
To:     Brian Hutchinson <b.hutchman@...il.com>
Cc:     Florian Fainelli <f.fainelli@...il.com>, netdev@...r.kernel.org,
        andrew@...n.ch, woojung.huh@...rochip.com,
        UNGLinuxDriver@...rochip.com, j.vosburgh@...il.com,
        vfalico@...il.com, andy@...yhouse.net, davem@...emloft.net,
        kuba@...nel.org
Subject: Re: Bonded multicast traffic causing packet loss when using DSA with
 Microchip KSZ9567 switch

Hi Brian,

On Thu, Jul 28, 2022 at 03:14:17PM -0400, Brian Hutchinson wrote:
> > So I mentioned in a recent PM that I was looking at other vendor DSA
> > drivers and I see code that smells like some of the concerns you have.
> >
> > I did some grepping on /drivers/net/dsa and while I get hits for
> > things like 'flood', 'multicast', 'igmp' etc. in marvell and broadcom
> > drivers ... I get nothing on microchip.  Hardware documentation has
> > whole section on ingress and egress rate limiting and shaping but
> > doesn't look like drivers use any of it.
> >
> > Example:
> >
> > /drivers/net/dsa/mv88e6xxx$ grep -i multicast *.c
> > chip.c: { "in_multicasts",              4, 0x07, STATS_TYPE_BANK0, },
> > chip.c: { "out_multicasts",             4, 0x12, STATS_TYPE_BANK0, },
> > chip.c:                  is_multicast_ether_addr(addr))
> > chip.c: /* Upstream ports flood frames with unknown unicast or multicast DA */
> > chip.c:  * forwarding of unknown unicasts and multicasts.
> > chip.c:         dev_err(ds->dev, "p%d: failed to load multicast MAC address\n",
> > chip.c:                                  bool unicast, bool multicast)
> > chip.c:                                                       multicast);
> > global2.c:      /* Consider the frames with reserved multicast destination
> > global2.c:      /* Consider the frames with reserved multicast destination
> > port.c:                              bool unicast, bool multicast)
> > port.c: if (unicast && multicast)
> > port.c: else if (multicast)
> > port.c:                                       int port, bool multicast)
> > port.c: if (multicast)
> > port.c:                              bool unicast, bool multicast)
> > port.c: return mv88e6185_port_set_default_forward(chip, port, multicast);
> >
> > Wondering if some needed support is missing.

I know it's tempting to look at other drivers and think "whoa, look how
much code these guys have, and I went for the cheaper switch!", but here
it really does not matter in the slightest.

Your application, as far as I understand it, requires the KSZ switch to
operate as a simple port multiplexer, with no hardware offloading of
packet processing (essentially all ports operate as what we call
'standalone'). It's quite sad that this mode didn't work with the KSZ
driver. But what you're looking at ('multicast', 'igmp' and so on) only
matters if you instruct the switch to forward packets in hardware or to
trap packets for control protocols. Neither applies here.

> > Will try your patch and report back.
> 
> I applied Vladimir's patch (had to edit it to change ksz9477.c to
> ksz9477_main.c) ;)
> 
> I did the same steps as before but ran multicast iperf a bit longer as
> I wasn't noticing packet loss this time.  I also fat fingered options
> on first iperf run so if you focus on the number of datagrams iperf
> sent below, the RX counts won't match that.
> 
> On PC ran: iperf -s -u -B 239.0.0.67%enp4s0 -i 1
> On my board I ran: iperf -B 192.168.1.6 -c 239.0.0.67 -u --ttl 5 -t
> 3600 -b 1M -i 1 (I noticed I had a copy/paste error in previous email
> ... no I didn't use a -ttl of 3000!!!).  Again I didn't let iperf run
> for 3600 sec., ctrl-c it early.
> 
> Pings from external PC to board while iperf multicast test was going
> on resulted in zero dropped packets.

Can you please reword this so that I can understand beyond any doubt
that you're saying that the patch has fixed the problem?

> .
> .
> .
> 64 bytes from 192.168.1.6: icmp_seq=98 ttl=64 time=1.94 ms
> 64 bytes from 192.168.1.6: icmp_seq=99 ttl=64 time=1.91 ms
> 64 bytes from 192.168.1.6: icmp_seq=100 ttl=64 time=0.713 ms
> 64 bytes from 192.168.1.6: icmp_seq=101 ttl=64 time=1.95 ms
> 64 bytes from 192.168.1.6: icmp_seq=102 ttl=64 time=1.26 ms
> ^C
> --- 192.168.1.6 ping statistics ---
> 102 packets transmitted, 102 received, 0% packet loss, time 101265ms
> rtt min/avg/max/mdev = 0.253/1.451/2.372/0.414 ms
> 
> ... I also noticed that the board's ping time greatly improved too.
> I've noticed ping times are usually over 2ms and I'm not sure why or
> what to do about it.

So are they usually over 2 ms now, or were they before the patch? I see
1.95 ms, which is not far off.
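
To answer the "over 2 ms or not" question from the ping output itself, a
minimal Python sketch along these lines could count the samples above a
threshold. The function name rtt_samples_ms and the 2.0 ms threshold are
only illustrative assumptions; the sample lines are the five replies
quoted above, not the full 102-packet run.

```python
import re

# The five ping replies quoted in the mail above (not the whole run).
PING_SAMPLE = """\
64 bytes from 192.168.1.6: icmp_seq=98 ttl=64 time=1.94 ms
64 bytes from 192.168.1.6: icmp_seq=99 ttl=64 time=1.91 ms
64 bytes from 192.168.1.6: icmp_seq=100 ttl=64 time=0.713 ms
64 bytes from 192.168.1.6: icmp_seq=101 ttl=64 time=1.95 ms
64 bytes from 192.168.1.6: icmp_seq=102 ttl=64 time=1.26 ms
"""


def rtt_samples_ms(text):
    """Extract the 'time=X ms' values from ping output as floats."""
    return [float(value) for value in re.findall(r"time=([\d.]+) ms", text)]


samples = rtt_samples_ms(PING_SAMPLE)
# Hypothetical threshold, taken from the "over 2 ms" remark in the mail.
over_threshold = [s for s in samples if s > 2.0]
print(len(samples), len(over_threshold), max(samples))
```

Feeding it the complete ping log from before and after the patch would
show whether the share of slow replies actually changed.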

I think "rteval" / "cyclictest" / "perf" are the kind of tools you need
to look at, if you want to improve this RTT.

> iperf on board sent 9901 datagrams:
> 
> .
> .
> .
> [  3] 108.0-109.0 sec   128 KBytes  1.05 Mbits/sec
> [  3] 109.0-110.0 sec   129 KBytes  1.06 Mbits/sec
> [  3] 110.0-111.0 sec   128 KBytes  1.05 Mbits/sec
> ^C[  3]  0.0-111.0 sec  13.9 MBytes  1.05 Mbits/sec
> [  3] Sent 9901 datagrams
> 
> ethtool statistics:
> 
> ethtool -S eth0 | grep -v ': 0'
> NIC statistics:
>     tx_packets: 32713
>     tx_broadcast: 2
>     tx_multicast: 32041
>     tx_65to127byte: 719
>     tx_128to255byte: 30
>     tx_1024to2047byte: 31964
>     tx_octets: 48598874
>     IEEE_tx_frame_ok: 32713
>     IEEE_tx_octets_ok: 48598874
>     rx_packets: 33260
>     rx_broadcast: 378
>     rx_multicast: 32209
>     rx_65to127byte: 1140
>     rx_128to255byte: 136
>     rx_256to511byte: 20
>     rx_1024to2047byte: 31964
>     rx_octets: 48624055
>     IEEE_rx_frame_ok: 33260
>     IEEE_rx_octets_ok: 48624055
>     p06_rx_bcast: 2
>     p06_rx_mcast: 32041
>     p06_rx_ucast: 670
>     p06_rx_65_127: 719
>     p06_rx_128_255: 30
>     p06_rx_1024_1522: 31964
>     p06_tx_bcast: 378
>     p06_tx_mcast: 32209
>     p06_tx_ucast: 673
>     p06_rx_total: 48598874
>     p06_tx_total: 48624055

(unrelated: the octet counts reported by the FEC match those of the KSZ switch; I'm impressed)
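
If you want to repeat that cross-check after each run, here is a minimal
Python sketch that does it, assuming the counter names look like the ones
in the eth0 output above. The helper names parse_ethtool_stats and
cpu_port_octets_match are made up for illustration, and "p06" is simply
the prefix that output happens to show. What the FEC transmits is what
the switch CPU port receives, and vice versa, so the comparison is
crossed.

```python
def parse_ethtool_stats(text):
    """Parse 'name: value' lines from `ethtool -S` output into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        # Drop email quote markers and indentation, if any.
        line = line.lstrip("> ").strip()
        name, sep, value = line.rpartition(":")
        value = value.strip()
        # Skips the "NIC statistics:" header and any non-numeric line.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def cpu_port_octets_match(stats, port_prefix="p06"):
    """Compare master TX with CPU port RX, and master RX with CPU port TX."""
    return (stats["tx_octets"] == stats[f"{port_prefix}_rx_total"]
            and stats["rx_octets"] == stats[f"{port_prefix}_tx_total"])


# Values copied from the eth0 output quoted above.
ETH0_SAMPLE = """\
NIC statistics:
    tx_octets: 48598874
    rx_octets: 48624055
    p06_rx_total: 48598874
    p06_tx_total: 48624055
"""

eth0_stats = parse_ethtool_stats(ETH0_SAMPLE)
print(cpu_port_octets_match(eth0_stats))
```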

> # ethtool -S lan1 | grep -v ': 0'
> NIC statistics:
>     tx_packets: 32711
>     tx_bytes: 48401459
>     rx_packets: 1011
>     rx_bytes: 84159
>     rx_bcast: 207
>     rx_mcast: 111
>     rx_ucast: 697
>     rx_64_or_less: 234
>     rx_65_127: 699
>     rx_128_255: 70
>     rx_256_511: 12
>     tx_bcast: 2
>     tx_mcast: 32015
>     tx_ucast: 694
>     rx_total: 103241
>     tx_total: 48532849
>     rx_discards: 4
> 
> # ethtool -S lan2 | grep -v ': 0'
> NIC statistics:
>     rx_packets: 32325
>     rx_bytes: 47915110
>     rx_bcast: 209
>     rx_mcast: 32120
>     rx_64_or_less: 212
>     rx_65_127: 55
>     rx_128_255: 86
>     rx_256_511: 12
>     rx_1024_1522: 31964
>     rx_total: 48497844
>     rx_discards: 4

Still 4 rx_discards here and on lan1. Not sure exactly when those
packets were discarded, or what those were.

Generally what I do to observe this kind of thing is to run

	watch -n 1 "ethtool -S lan1 | grep -v ': 0'"

and see what actually increments, in real time.

It would be helpful if you could say for certain whether those drops were
already there before you ran the test (packets received by the MAC while
the port was down?), or whether we need to look further into the problem
there.
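
If eyeballing the counters is inconvenient, the same before/after
comparison can be sketched in Python: take one snapshot before starting
iperf, one after, and print only what changed. This is a rough sketch,
not an existing tool; the helper names are hypothetical, the "after"
values are lan2's from your mail, and the "before" values are invented
purely to show the mechanics.

```python
import subprocess


def parse_ethtool_stats(text):
    """Parse 'name: value' lines from `ethtool -S` output into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def snapshot(iface):
    """Read the current counters of an interface (requires ethtool)."""
    result = subprocess.run(["ethtool", "-S", iface],
                            capture_output=True, text=True, check=True)
    return parse_ethtool_stats(result.stdout)


def counter_deltas(before, after):
    """Return {name: increase} for every counter that changed."""
    # A counter absent from 'before' is treated as 0, which is what
    # filtering with grep -v ': 0' would have hidden.
    return {name: after[name] - before.get(name, 0)
            for name in after
            if after[name] != before.get(name, 0)}


# Invented baseline, for illustration only.
before = {"rx_packets": 300, "rx_mcast": 100, "rx_discards": 4}
# Values from the lan2 output quoted above.
after = {"rx_packets": 32325, "rx_mcast": 32120, "rx_discards": 4}

deltas = counter_deltas(before, after)
print(deltas)
```

On the board this would be snapshot("lan1") before the test and again
after it. If rx_discards does not show up in the deltas, the drops were
there before the test started.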

> ifconfig stats: (2 dropped packets on lan2.  Last time lan1 and lan2
> about roughly same RX counts, this time lan1 significantly less)

I've no idea where the 'dropped' packets as reported by ifconfig come
from. I'm almost certain it's not from DSA.
