Date: Mon, 16 Oct 2023 13:49:25 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Miquel Raynal <miquel.raynal@...tlin.com>
Cc: "Russell King (Oracle)" <linux@...linux.org.uk>, Wei Fang <wei.fang@....com>, 
	Shenwei Wang <shenwei.wang@....com>, Clark Wang <xiaoning.wang@....com>, davem@...emloft.net, 
	kuba@...nel.org, pabeni@...hat.com, linux-imx@....com, netdev@...r.kernel.org, 
	Thomas Petazzoni <thomas.petazzoni@...tlin.com>, 
	Alexandre Belloni <alexandre.belloni@...tlin.com>, 
	Maxime Chevallier <maxime.chevallier@...tlin.com>, Andrew Lunn <andrew@...n.ch>, 
	Stephen Hemminger <stephen@...workplumber.org>
Subject: Re: Ethernet issue on imx6

On Fri, Oct 13, 2023 at 10:40 AM Miquel Raynal
<miquel.raynal@...tlin.com> wrote:
>
> Hi Russell,
>
> linux@...linux.org.uk wrote on Thu, 12 Oct 2023 20:39:11 +0100:
>
> > On Thu, Oct 12, 2023 at 07:34:10PM +0200, Miquel Raynal wrote:
> > > Hello,
> > >
> > > I've been scratching my head for weeks over a strange imx6
> > > network issue, and I need help to go further, as I feel a bit clueless now.
> > >
> > > Here is my setup:
> > > - Custom imx6q board
> > > - Bootloader: U-Boot 2017.11 (also tried with a 2016.03)
> > > - Kernel: 4.14(.69,.146,.322), v5.10 and v6.5 with the same behavior
> > > - The MAC (fec driver) is connected to a Micrel 9031 PHY
> > > - The PHY is connected to the link partner through an industrial cable
> >
> > "industrial cable" ?
>
> It is a "unique" hardware cable: each of the four Ethernet pairs is a
> foiled twisted pair and the whole cable is shielded. Additionally, the
> 24V power supply also comes through this cable. The connector is from
> ODU S22LOC-P16MCD0-920S. The structure of the cable should be similar
> to a CAT7 cable with the additional power supply line.
>
> > > - Testing 100BASE-T (link is stable)
> >
> > Would that be full or half duplex?
>
> Ah, yeah, sorry for forgetting this detail, it's full duplex.
>
> > > The RGMII-ID timings are probably not totally optimal but offer
> > > rather good performance. In UDP with iperf3:
> > > * Downlink (host to the board) runs at full speed with 0% drop
> > > * Uplink (board to host) runs at full speed with <1% drop
> > >
> > > However, if I ever try to limit the bandwidth in uplink (only), the
> > > drop rate rises significantly, up to 30%:
> > >
> > > //192.168.1.1 is my host, so the below lines are from the board:
> > > # iperf3 -c 192.168.1.1 -u -b100M
> > > [  5]   0.00-10.05  sec   113 MBytes  94.6 Mbits/sec  0.044 ms  467/82603 (0.57%)  receiver
> > > # iperf3 -c 192.168.1.1 -u -b90M
> > > [  5]   0.00-10.04  sec  90.5 MBytes  75.6 Mbits/sec  0.146 ms  12163/77688 (16%)  receiver
> > > # iperf3 -c 192.168.1.1 -u -b80M
> > > [  5]   0.00-10.05  sec  66.4 MBytes  55.5 Mbits/sec  0.162 ms  20937/69055 (30%)  receiver
> >
> > My setup:
> >
> > i.MX6DL silicon rev 1.3
> > Atheros AR8035 PHY
> > 6.3.0+ (no significant changes to fec_main.c)
> > Link, being BASE-T, is standard RJ45.
> >
> > Connectivity is via a bridge device (sorry, can't change that as it
> > would be too disruptive, as this is my Internet router!)
> >
> > Running at 1000BASE-T (FD):
> > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > [  5]   0.00-10.01  sec   114 MBytes  95.4 Mbits/sec  0.030 ms  0/82363 (0%)  receiver
> > [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.103 ms  0/77691 (0%)  receiver
> > [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.101 ms  0/69060 (0%)  receiver
> >
> > Running at 100BASE-Tx (FD):
> > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > [  5]   0.00-10.01  sec   114 MBytes  95.4 Mbits/sec  0.008 ms  0/82436 (0%)  receiver
> > [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.088 ms  0/77692 (0%)  receiver
> > [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.108 ms  0/69058 (0%)  receiver
> >
> > Running at 100BASE-Tx (HD):
> > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > [  5]   0.00-10.01  sec   114 MBytes  95.3 Mbits/sec  0.056 ms  0/82304 (0%)  receiver
> > [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.101 ms  1/77691 (0.0013%)  receiver
> > [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.105 ms  0/69058 (0%)  receiver
> >
> > So I'm afraid I don't see your issue.
>
> I believe the issue cannot be at a higher level than the MAC. I also
> do not think the MAC driver and PHY driver are specifically buggy. I
> ruled out a hardware issue, given that under certain conditions
> (high load) the network works rather well... But I certainly
> see this issue, and when switching to TCP the results are dramatic:
>
> # iperf3 -c 192.168.1.1
> Connecting to host 192.168.1.1, port 5201
> [  5] local 192.168.1.2 port 37948 connected to 192.168.1.1 port 5201
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.00   sec  11.3 MBytes  94.5 Mbits/sec   43   32.5 KBytes
> [  5]   1.00-2.00   sec  3.29 MBytes  27.6 Mbits/sec   26   1.41 KBytes
> [  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> [  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> [  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    5   1.41 KBytes
> [  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> [  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> [  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> [  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> [  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
>
> Thanks,
> Miquèl

Can you experiment with:

- Disabling TSO on your NIC (ethtool -K eth0 tso off)
- Reducing max GSO size (ip link set dev eth0 gso_max_size 16384)

I suspect some kind of issue with fec TX completion vs. TSO emulation.
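
The two experiments above could be scripted roughly as follows. This is
only a sketch: the function name tso_experiment and the RUN variable are
made up for illustration, and eth0 and 16384 are simply the values from
the commands above. By default it only prints the commands; with RUN=1
it executes them, which needs root on the board.

```shell
#!/bin/sh
# Sketch only: "tso_experiment" and RUN are hypothetical names.
# Prints the two experiment commands, or runs them when RUN=1 is set.
tso_experiment() {
    iface="${1:-eth0}"     # interface name, eth0 as in the commands above
    gso="${2:-16384}"      # gso_max_size value suggested above
    for cmd in \
        "ethtool -K $iface tso off" \
        "ip link set dev $iface gso_max_size $gso"
    do
        if [ "${RUN:-0}" = 1 ]; then
            $cmd || return 1   # stop at the first command that fails
        else
            echo "$cmd"        # dry run: show what would be executed
        fi
    done
}
```

For example, "tso_experiment eth0" prints both commands, and
"RUN=1 tso_experiment eth0" applies them. Afterwards, "ethtool -k eth0"
and "ip -d link show dev eth0" can be used to check the resulting
settings before re-running the TCP test (iperf3 -c 192.168.1.1).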
