Date:   Fri, 19 Oct 2018 17:15:03 +0200
From:   Richard Genoud <richard.genoud@...il.com>
To:     Thomas Petazzoni <thomas.petazzoni@...tlin.com>
Cc:     Antoine Tenart <antoine.tenart@...tlin.com>,
        Gregory CLEMENT <gregory.clement@...tlin.com>,
        Yelena Krivosheev <yelena@...vell.com>,
        Maxime Chevallier <maxime.chevallier@...tlin.com>,
        Nicolas Ferre <nicolas.ferre@...rochip.com>,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: CRC errors between mvneta and macb

Hi all,

I've been struggling with strange behavior between a clearfog-pro
and an at91sam9g35-ek board.

TL;DR: ethernet frames are received with a CRC error on the clearfog
ETH0, but seem perfectly all right. Add a switch between the 2
boards, and the ethernet frames are accepted.


I've got a clearfog pro and an at91sam9g35-ek, both with kernel
4.19-rc8.
An RJ45 cable is plugged between the clearfog (on the solo port (eth0))
and the g35-ek board (100Mb/s).

They are configured with autoneg and a fixed IP address.

I start the 2 boards and, from the clearfog, I ping the g35-ek.
If it succeeds, it keeps succeeding until the g35-ek is rebooted.
If it fails, it also keeps failing until the g35-ek is rebooted.

Rebooting the clearfog doesn't change anything.
Resetting the g35-ek PHY (mii-diag -R) doesn't change anything either.

When the ping fails, it's actually because the mvneta returns a CRC
error:
mvneta f1070000.ethernet eth0: bad rx status 0cc10000 (crc error), size=66
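
For reference, a minimal sketch of how the status word in that message
maps to "crc error". The bit positions are an assumption taken from the
MVNETA_RXD_ERR_* definitions in drivers/net/ethernet/marvell/mvneta.c as
I understand them; the function name is made up for illustration.

```python
# Assumed layout (from the MVNETA_RXD_ERR_* macros in mvneta.c):
# bit 16 is the error summary, bits 17-18 hold the error code.
ERR_SUMMARY = 1 << 16
ERR_CODE_MASK = (1 << 17) | (1 << 18)
ERR_CODES = {
    0: "crc error",
    1 << 17: "overrun error",
    1 << 18: "max frame length error",
    (1 << 17) | (1 << 18): "resource error",
}


def decode_rx_status(status):
    """Return the error name for an RX descriptor status, or None if no error."""
    if not status & ERR_SUMMARY:
        return None
    return ERR_CODES[status & ERR_CODE_MASK]


# Status word from the log line above.
print(decode_rx_status(0x0CC10000))
```

With the status from the log, the summary bit is set and the error code
bits are both zero, which is the CRC case.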

And, if I plug the RJ45 cable between the clearfog's matrix and the
g35-ek, everything works well, always.

To ease the debugging, instead of a ping I used:
https://gist.github.com/austinmarton/1922600
from the g35-ek in order to have the same frame every time.
Then I checked the ethernet CRC with the scope (on the g35-ek PHY
TXD[0-1] (DM9161A)).
And the CRC is all right.
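
For anyone wanting to repeat the check on bytes decoded from a scope
capture, here is a minimal sketch of the frame check sequence
computation. The Ethernet FCS uses the same CRC-32 as zlib and is sent
least significant byte first. The MAC addresses and payload below are
hypothetical, not the frame used in this test.

```python
import struct
import zlib

# CRC-32 of any frame with a correct FCS appended yields this constant
# (the standard CRC-32 residue, as seen after zlib's final XOR).
CRC32_RESIDUE = 0x2144DF1C


def ethernet_fcs(frame_without_fcs):
    """Return the 4 FCS bytes in the order they appear on the wire."""
    return struct.pack("<I", zlib.crc32(frame_without_fcs) & 0xFFFFFFFF)


def fcs_is_valid(frame_with_fcs):
    """Check a captured frame (destination MAC through FCS, no preamble)."""
    return (zlib.crc32(frame_with_fcs) & 0xFFFFFFFF) == CRC32_RESIDUE


# Hypothetical frame: addresses, ethertype and a zero-filled minimum payload.
dst = bytes.fromhex("001122334455")
src = bytes.fromhex("66778899aabb")
ethertype = bytes.fromhex("0800")
payload = bytes(46)

frame = dst + src + ethertype + payload
wire = frame + ethernet_fcs(frame)

# Flip one bit to model a corrupted frame.
corrupted = bytes([wire[0] ^ 0x01]) + wire[1:]

print(fcs_is_valid(wire), fcs_is_valid(corrupted))
```

Feeding the captured bytes (without preamble and start-of-frame
delimiter) to fcs_is_valid() gives the same yes/no answer as comparing
the last 4 bytes by hand.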

I also managed to trigger this bug by simply doing:
rmmod macb ; insmod macb.ko on the g35-ek.
Then, frames are accepted, or not.

I checked all PHY/macb register values on the g35-ek, they are the same.

The only thing I could find is related to the TXCLK on the PHY.

When there's a CRC error, the TXCLK has its polarity inverted...
That's a clue!

But this TXCLK (25MHz) is not used on the g35-ek.
Only the REFCLK/XT2 (50MHz) is used to synchronise the PHY and the macb.
So I guess that the TXCLK plays a role in generating TX+/TX-.

And I also guess that when the signal is converted back on the clearfog,
the clock polarity is somehow responsible for the CRC errors.

I was about to put my scope on the clearfog's PHY to see what it
received, but Marvell's documentation is not as freely available as
Atmel's, so I'm quite stuck at this point.

Any ideas?

NB: I also managed to trigger this with an at91sam9g20-ek (but not with
a sama5d2).


Regards,
Richard
