Message-ID: <ZiBgvRKbxrVSu6rR@euler>
Date: Wed, 17 Apr 2024 18:52:29 -0500
From: Colin Foster <colin.foster@...advantage.com>
To: Andrew Lunn <andrew@...n.ch>
Cc: netdev@...r.kernel.org
Subject: Re: Beaglebone Ethernet Probe Failure In 6.8+
Hi Andrew,

On Wed, Apr 17, 2024 at 09:30:58PM +0200, Andrew Lunn wrote:
> On Wed, Apr 17, 2024 at 10:42:02AM -0500, Colin Foster wrote:
> > Hello,
> >
> > I'm chasing down an issue in recent kernels. My setup is slightly
> > unconventional: a BBB with ETH0 as a CPU port to a DSA switch that is
> > controlled by SPI. I'll have hardware next week, but think it is worth
> > getting a discussion going.
> >
> > The commit in question is commit df16c1c51d81 ("net: phy: mdio_device:
> > Reset device only when necessary"). This seems to cause a probe error of
> > the MDIO device. A dump_stack was added where the reset is skipped.
> >
> > SMSC LAN8710/LAN8720: probe of 4a101000.mdio:00 failed with error -5
>
> Can you confirm this EIO is this one:
>
> https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/ti/davinci_mdio.c#L440
>
> It would be good to check the value of USERACCESS_ACK, and what the
> datasheet says about it.
>
> The MDIO bus itself has no real way of telling if there is a device on
> the bus at a given address, and so if the device actually transfers
> anything on a read. So if the resets are wrong, the device is still in
> reset, or coming out of reset but not yet ready, you should just read
> 0xffff. Returning EIO would indicate some other issue.

I'll look into this next week when I have hardware again.

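As a rough illustration of the distinction Andrew draws above, the sketch
below classifies the result of a single MDIO read. It assumes the common
convention of a helper that returns a negative errno on failure and the
16-bit register value otherwise. The enum and function names are
hypothetical and are not taken from the kernel or from davinci_mdio.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical outcome categories for one MDIO read; not kernel names. */
enum mdio_read_outcome {
	MDIO_READ_BUS_ERROR,   /* the controller reported a failure (negative errno) */
	MDIO_READ_NO_RESPONSE, /* nothing drove the data line, so it read as all ones */
	MDIO_READ_DATA,        /* a device answered with a register value */
};

/*
 * Classify the return value of a read helper that returns a negative
 * errno on failure and a 16-bit register value otherwise (assumption).
 */
enum mdio_read_outcome classify_mdio_read(int ret)
{
	if (ret < 0)
		return MDIO_READ_BUS_ERROR;

	assert(ret <= 0xffff); /* register values are 16 bits wide */

	if ((uint16_t)ret == 0xffff)
		return MDIO_READ_NO_RESPONSE;

	return MDIO_READ_DATA;
}
```

Under this model, a PHY that is still held in reset would be expected to
land in MDIO_READ_NO_RESPONSE, while the -5 (-EIO) seen in the probe
failure lands in MDIO_READ_BUS_ERROR, which points at the controller
rather than at the PHY.
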
>
> > Because this failure happens much earlier than DSA, I suspect it isn't
> > isolated to me and my setup - but I'm not positive at the moment.
> >
> > I suspect one of the following:
> >
> > 1. There's an issue with my setup / configuration.
> >
> > 2. This is an issue for every BBB device, but probe failures don't
> > actually break functionality.
> >
> >
> > Depending on which of those is the case, I'll either need to:
> >
> > A. revert the patch because it is causing probe failures
> >
> > B. determine why the probe is failing in the MDIO driver and try to fix
> > that
> >
> > C. Introduce an API to force resets, regardless of the previous state,
> > and apply that to the failure cases.
> >
> >
> > I assume the path forward is option B... but if the issue is more
> > widespread, options A or C might be the correct path.
>
> I would prefer B, at least let's try to understand the
> problem. Depending on what we find, we might need A, but let's decide
> that later.

Agreed.

>
> > [ 1.553623] SMSC LAN8710/LAN8720: probe of 4a101000.mdio:00 failed with error -5
> > [ 1.553762] davinci_mdio 4a101000.mdio: phy[0]: device 4a101000.mdio:00, driver SMSC LAN8710/LAN8720
> > [ 1.554978] cpsw-switch 4a100000.switch: initialized cpsw ale version 1.4
> > [ 1.555011] cpsw-switch 4a100000.switch: ALE Table size 1024
> > [ 1.555210] cpsw-switch 4a100000.switch: cpts: overflow check period 500 (jiffies)
> > [ 1.555234] cpsw-switch 4a100000.switch: CPTS: ref_clk_freq:250000000 calc_mult:2147483648 calc_shift:29 error:0 nsec/sec
> > [ 1.555343] cpsw-switch 4a100000.switch: Detected MACID = 24:76:25:76:35:37
> > [ 1.558098] cpsw-switch 4a100000.switch: initialized (regs 0x4a100000, pool size 256) hw_ver:0019010C 1.12 (0)
>
> So despite the -EIO, it finds the PHY, and the switch seems to probe
> O.K?

Yes. The issue I face is actually down the line when I enable the DSA
ports. I haven't diagnosed it yet, but a separate reset happens from
within phy_init_hw.

In the log below I've kept the dump_stack() from the patch but removed
the early return, so the reset still happens and the setup is functional.

This is why it seems like it might be a bug that everyone is seeing, but
nobody is noticing... I hope to know more next week.

[ 8.581463] EXT4-fs (mmcblk0p2): re-mounted 084255e0-9101-48d6-af17-9601fd9c5a1d r/w. Quota mode: disabled.
[ 32.500235] cpsw-switch 4a100000.switch: starting ndev. mode: dual_mac
[ 32.522610] CPU: 0 PID: 166 Comm: ip Not tainted 6.7.0-rc3-00667-gdf16c1c51d81-dirty #1408
[ 32.530962] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 32.537090] Backtrace:
[ 32.539561] dump_backtrace from show_stack+0x20/0x24
[ 32.550363] show_stack from dump_stack_lvl+0x60/0x78
[ 32.555461] dump_stack_lvl from dump_stack+0x18/0x1c
[ 32.566238] dump_stack from mdio_device_reset+0xc4/0x108
[ 32.571685] mdio_device_reset from phy_init_hw+0x20/0xb8
[ 32.580713] phy_init_hw from phy_attach_direct+0x148/0x340
[ 32.589911] phy_attach_direct from phy_connect_direct+0x2c/0x68
[ 32.607416] phy_connect_direct from of_phy_connect+0x54/0x7c
[ 32.618889] of_phy_connect from cpsw_ndo_open+0x30c/0x4e4
[ 32.630096] cpsw_ndo_open from __dev_open+0xfc/0x1b0
[ 32.645608] __dev_open from __dev_change_flags+0x198/0x218
[ 32.656909] __dev_change_flags from dev_change_flags+0x28/0x64
[ 32.670656] dev_change_flags from do_setlink+0x258/0xed4
[ 32.681789] do_setlink from rtnl_newlink+0x544/0x87c
[ 32.697294] rtnl_newlink from rtnetlink_rcv_msg+0x138/0x318
[ 32.713408] rtnetlink_rcv_msg from netlink_rcv_skb+0xc8/0x12c
[ 32.729702] netlink_rcv_skb from rtnetlink_rcv+0x20/0x24
[ 32.740825] rtnetlink_rcv from netlink_unicast+0x1b0/0x2a4
[ 32.746435] netlink_unicast from netlink_sendmsg+0x1a4/0x408
[ 32.760001] netlink_sendmsg from ____sys_sendmsg+0xb8/0x2c4
[ 32.776110] ____sys_sendmsg from ___sys_sendmsg+0x7c/0xb4
[ 32.792046] ___sys_sendmsg from sys_sendmsg+0x60/0xa8
[ 32.803952] sys_sendmsg from ret_fast_syscall+0x0/0x1c
[ 32.809212] Exception stack(0xe0c3dfa8 to 0xe0c3dff0)
[ 32.814295] dfa0: 00000002 0054ecc8 00000003 bec65790 00000000 00000000
[ 32.822514] dfc0: 00000002 0054ecc8 b6f54880 00000128 00000000 00000001 bec65f32 bec65f35
[ 32.830731] dfe0: 00000128 bec65748 b6e4e52f b6dcce06
[ 32.835809] r6:b6f54880 r5:0054ecc8 r4:00000002
[ 32.979240] SMSC LAN8710/LAN8720 4a101000.mdio:00: attached PHY driver (mii_bus:phy_addr=4a101000.mdio:00, irq=POLL)
[ 32.994721] 8021q: adding VLAN 0 to HW filter on device eth0
[ 33.020751] ocelot-ext-switch ocelot-ext-switch.5.auto swp1: configuring for phy/internal link mode
[ 33.055444] ocelot-ext-switch ocelot-ext-switch.5.auto swp2: configuring for phy/internal link mode
[ 33.089784] ocelot-ext-switch ocelot-ext-switch.5.auto swp3: configuring for phy/internal link mode
[ 33.124241] ocelot-ext-switch ocelot-ext-switch.5.auto swp4: configuring for phy/qsgmii link mode
[ 33.161283] ocelot-ext-switch ocelot-ext-switch.5.auto swp5: configuring for phy/qsgmii link mode
[ 33.198704] ocelot-ext-switch ocelot-ext-switch.5.auto swp6: configuring for phy/qsgmii link mode
[ 33.235518] ocelot-ext-switch ocelot-ext-switch.5.auto swp7: configuring for phy/qsgmii link mode
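
The reset-skip behaviour discussed in this thread can be modelled in a few
lines of userspace C. This is only a sketch based on the commit title and
the description above. It is not the code from df16c1c51d81, and every
name in it (model_mdio_dev, model_drive_reset, model_mdio_device_reset,
the force flag) is hypothetical. The force flag stands in for option C,
an API that drives the reset line regardless of the previous state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an MDIO device with a reset line. */
struct model_mdio_dev {
	int reset_state;      /* last value written, -1 if never written */
	unsigned int toggles; /* how many times the line was actually driven */
};

/* Drive the simulated reset line unconditionally and cache the value. */
void model_drive_reset(struct model_mdio_dev *dev, int value)
{
	assert(dev != NULL);
	dev->reset_state = value;
	dev->toggles++;
}

/*
 * Request a reset state. Returns true if the line was driven, false if
 * the request was skipped because the cached state already matched.
 * Passing force = true models option C and bypasses the cache check.
 */
bool model_mdio_device_reset(struct model_mdio_dev *dev, int value, bool force)
{
	assert(dev != NULL);

	if (!force && dev->reset_state == value)
		return false; /* this is the skip path where dump_stack() was placed */

	model_drive_reset(dev, value);
	return true;
}
```

In this model, a second request for the same state is skipped unless force
is set, so a caller such as phy_init_hw that expects every call to pulse
the line would see no hardware activity on the skipped call.
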
Colin Foster