Message-ID: <Zi7+xqp1GG6Jl/kU@colin-ia-desktop>
Date: Sun, 28 Apr 2024 20:58:30 -0500
From: Colin Foster <colin.foster@...advantage.com>
To: Andrew Halaney <ahalaney@...hat.com>
Cc: Andrew Lunn <andrew@...n.ch>, netdev@...r.kernel.org,
linux-omap@...r.kernel.org
Subject: Re: Beaglebone Ethernet Probe Failure In 6.8+

Hi Andrew L and Andrew H,

Sorry for the delayed response. I couldn't get to testing anything until
just now.

On Tue, Apr 23, 2024 at 03:07:15PM -0500, Andrew Halaney wrote:
> On Tue, Apr 23, 2024 at 03:52:35PM +0200, Andrew Lunn wrote:
> > On Mon, Apr 22, 2024 at 11:00:51PM -0500, Colin Foster wrote:
> >
> > In these two last transactions, the ACK bit is not set.
> >
> > > [ 1.550471] SMSC LAN8710/LAN8720: probe of 4a101000.mdio:00 failed with error -5
> > > [ 1.550592] davinci_mdio 4a101000.mdio: phy[0]: device 4a101000.mdio:00, driver SMSC LAN8710/LAN8720
> > >
> > > Without the mdiodev->reset_state patch, I see the following:
> > >
> > > [ 1.537817] davinci_mdio 4a101000.mdio: davinci mdio revision 1.6, bus freq 1000000
> > > [ 1.538165] davinci mdio reg is 0x20400007
> > > [ 1.538426] davinci mdio reg is 0x2060c0f1
> >
> > Same as above.
> >
> > > [ 1.558442] davinci mdio reg is 0x23a00090
> > > [ 1.558717] davinci mdio reg is 0x20207809
> > > [ 1.559681] davinci mdio reg is 0x21c0ffff
> >
> > In all these cases, we see the ACK bit set.
> >
> > So the PHY is responding to registers 2 and 3, the ID registers. But
> > it seems to be failing to respond to other registers. At a guess, I
> > would say it is still coming out of reset. Does the datasheet for the
> > LAN8710/LAN8720 say anything about how long a reset takes? Can you get
> > a logic analyser onto the reset line and MDIO bus and see how
> > different the timing is? It might be you need to add some delay values
> > to the reset in DT.
I don't think I'll be able to get onto those lines. But I do think this
is the right tree to bark up. I also found some kernelci logs that
suggest I'm not the only one seeing this issue:
https://storage.kernelci.org/mainline/master/v6.9-rc5/arm/multi_v7_defconfig/gcc-10/lab-cip/baseline-beaglebone-black.html
There might be ways to navigate the kernelci database that I'm not aware
of, but I couldn't reasonably say "before 6.8 it didn't happen, and
after 6.8 it did." I'm not sure that matters at this point though.
>
> For what it's worth, I think that this theory makes sense if reverting the patch
> highlighted above makes this go away. Before that patch, you'd see a
> flow like this:
>
> net: phy: mdio_device: Reset device only when necessary
>
> Currently the phy reset sequence is as shown below for a
> devicetree described mdio phy on boot:
>
> 1. Assert the phy_device's reset as part of registering
> 2. Deassert the phy_device's reset as part of registering
> 3. Deassert the phy_device's reset as part of phy_probe
> 4. Deassert the phy_device's reset as part of phy_hw_init
>
> Which means the deassert time, whatever it was, was effectively tripled
> in practice before you got around to phy_hw_init() (which, if I
> understand correctly, is when things start reporting no ACK above).
>
> I am not sure what devicetree upstream would be the one to look at for
> your beaglebone, but microchip's datasheet for the LAN8720A has
> "TABLE 5-8: POWER-ON NRST & ..." section detailing some reset requirements:
>
> https://ww1.microchip.com/downloads/en/devicedoc/00002165b.pdf
>
> If I read it right, assert time needs to be >= 100 us, and
> deassert... is not so clear to me unfortunately. Maybe for starters
> triple your value and see if things work ok (just based on the 3
> repeated deasserts going down to 1 with the patch applied)? Hopefully
> longer term the actual deassert timing can be confirmed.
I went all in and added a 100 ms delay before returning from the
deasserts in steps 3 and 4 that you mention. Sure enough, everything
worked! The required delay certainly still needs to be understood and
optimized. I added the linux-omap list to this thread (please let me
know if there are others I should have CC'd on any of these emails).
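
To make the arithmetic behind the "tripled" deassert time concrete, here
is a small model of the four-step sequence from the commit message. It is
only a sketch, not the kernel code: settle_time_us() and its parameters
are hypothetical names, and it assumes that each reset call which is not
skipped sleeps for the full configured deassert delay.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model: return how long (in microseconds) the PHY has been
 * given to settle after reset release by the time phy_hw_init() finishes
 * its own deassert call.
 *
 * skip_redundant = false models the old behaviour (every call sleeps).
 * skip_redundant = true models tracking the reset state and returning
 * early when the line is already in the requested state.
 */
static unsigned int settle_time_us(bool skip_redundant, unsigned int deassert_us)
{
	/* 1 = assert, 0 = deassert: register, register, phy_probe, phy_hw_init */
	static const int sequence[] = { 1, 0, 0, 0 };
	int state = -1; /* unknown until the first call */
	unsigned int waited = 0;
	size_t i;

	for (i = 0; i < sizeof(sequence) / sizeof(sequence[0]); i++) {
		int value = sequence[i];

		if (skip_redundant && state == value)
			continue; /* line already in this state: no sleep */

		state = value;
		if (value)
			waited = 0; /* reset asserted: settle time starts over */
		else
			waited += deassert_us; /* sleep after releasing reset */
	}
	return waited;
}
```

For an arbitrary example delay of 1000 us (not a value taken from any
devicetree), settle_time_us(false, 1000) gives 3000 and
settle_time_us(true, 1000) gives 1000, which is the 3x difference Andrew H
describes.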
Either way, thank you both for helping me understand this! I hope to be
able to fix the issue, but at the very least I hope it is considered
"reported".
Colin Foster