lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <74cef958-9513-40d7-8da4-7a566ba47291@lunn.ch>
Date:   Mon, 20 Mar 2023 13:00:57 +0100
From:   Andrew Lunn <andrew@...n.ch>
To:     "Buzarra, Arturo" <Arturo.Buzarra@...i.com>
Cc:     Heiner Kallweit <hkallweit1@...il.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Florian Fainelli <f.fainelli@...il.com>,
        Russell King - ARM Linux <linux@...linux.org.uk>
Subject: Re: [PATCH] net: phy: return EPROBE_DEFER if PHY is not accessible

On Mon, Mar 20, 2023 at 09:45:38AM +0000, Buzarra, Arturo wrote:
> Hi,
> 
> I will try to answer all your questions:
> 
> - "We need more specifics here, what type of PHY device are you seeing this with?"
> - " So best start with some details about your use case, which MAC, which PHY, etc"
> I'm using a LAN8720A PHY (10/100 on RMII mode) with a ST MAC ( in particular is a stm32mp1 processor).
> We have two PHYs one is a Gigabit PHY (RGMII mode) and the another one is a 10/100 (RMII mode).

Where is the clock coming from? Does each PHY have its own crystal? Is
the clock coming from one of the MACs? Is one PHY a clock source and
the other a clock sync?

> In the boot process, I think that there is a race condition between
> configuring the Ethernet MACs and the two PHYs. At same point the
> RGMII Ethernet MAC is configured and starts the PHY probes.

You have two MACs. Do you have two MDIO busses, with one PHY on each
bus, or do you have just one MDIO bus in use, with two PHYs on it?

Please show us your Device Tree description of the hardware.

> When the 10/100 PHY starts the probe, it tries to read the
> genphy_read_abilities() and always reads 0xFFFF ( I assume that this
> is the default electrical values for that lines before it are
> configured).

There is a pull up on the MDIO data line, so that if nothing drives it
low, it reads 1. This was one of Florian comments. Have you check the
value of that pull up resistor?

> - " Does the device reliably enumerate on the bus, i.e. reading registers 2 and 3 to get its ID?"
> Reading the registers PHYSID1 and PHYSID2 reports also 0xFFFF

Which is odd, because the MDIO bus probe code would assume there is no
PHY there given those two values, and then would not bother trying to
read the abilities. So you are somehow forcing the core to assume
there is a PHY there. Do you have the PHY ID in DT?

> - " If the PHY is broken, by some yet to be determined definition of broken, we try to limit the workaround to as narrow as possible. So it should not be in the core probe code. It should be in the PHY specific driver, and ideally for only its ID, not the whole vendors family of PHYs"
> I have several workarounds/fixed for that, the easy way is set the PHY capabilities in the smsc.c driver " .features	= PHY_BASIC_FEATURES" like in the past and it works fine. Also I have another fix in the same driver adding a custom function for " get_features" where if I read 0xFFFF or 0x0000 return a EPROBE_DEFER. However after review another drivers (net/usb/pegasus.c , net/Ethernet/toshiba/spider_net.c, net/sis/sis190.c, and more...)  that also checks the result of read MII_BMSR against 0x0000 and 0xFFFF , I tried to send a common fix in the PHY core. From my point of view read a 0x0000 or 0xFFFF value is an error in the probe step like if the phy_read() returns an error, so I think that the PHY core should consider return a EPROBE_DEFER to at least provide a second try for that PHY device.

We first want to understand the problem before adding such hacks. It
really sounds like something the PHY needs is missing, a clock, time
after a reset, etc. If we can figure that out, we can avoid such
hacks.

	Andrew

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ