[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250402134355.cbe673oejhoaor5p@skbuf>
Date: Wed, 2 Apr 2025 16:43:55 +0300
From: Vladimir Oltean <vladimir.oltean@....com>
To: "Russell King (Oracle)" <linux@...linux.org.uk>,
Richard Cochran <richardcochran@...il.com>
Cc: netdev@...r.kernel.org, Andrew Lunn <andrew@...n.ch>,
Heiner Kallweit <hkallweit1@...il.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Florian Fainelli <florian.fainelli@...adcom.com>,
Wei Fang <wei.fang@....com>
Subject: Re: [RFC PATCH net] net: phy: allow MDIO bus PM ops to start/stop
state machine for phylink-controlled PHY
On Mon, Mar 03, 2025 at 06:06:18PM +0000, Russell King (Oracle) wrote:
> On Tue, Feb 25, 2025 at 05:31:56PM +0200, Vladimir Oltean wrote:
> > DSA has 2 kinds of drivers:
> >
> > 1. Those who call dsa_switch_suspend() and dsa_switch_resume() from
> > their device PM ops: qca8k-8xxx, bcm_sf2, microchip ksz
> > 2. Those who don't: all others. The above methods should be optional.
> >
> > For type 1, dsa_switch_suspend() calls dsa_user_suspend() -> phylink_stop(),
> > and dsa_switch_resume() calls dsa_user_resume() -> phylink_start().
> > These seem good candidates for setting mac_managed_pm = true because
> > that is essentially its definition, but that does not seem to be the
> > biggest problem for now, and is not what this change focuses on.
> >
> > Talking strictly about the 2nd category of drivers here, I have noticed
> > that these also trigger the
> >
> > WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY &&
> > phydev->state != PHY_UP);
> >
> > from mdio_bus_phy_resume(), because the PHY state machine is running.
> > It's running as a result of a previous dsa_user_open() -> ... ->
> > phylink_start() -> phy_start(), and AFAICS, mdio_bus_phy_suspend() was
> > supposed to have called phy_stop_machine(), but it didn't. So this is
> > why the PHY is in state PHY_NOLINK by the time mdio_bus_phy_resume()
> > runs.
> >
> > mdio_bus_phy_suspend() did not call phy_stop_machine() because for
> > phylink, the phydev->adjust_link function pointer is NULL. This seems a
> > technicality introduced by commit fddd91016d16 ("phylib: fix PAL state
> > machine restart on resume"). That commit was written before phylink
> > existed, and was intended to avoid crashing with consumer drivers which
> > don't use the PHY state machine - phylink does.
> >
> > Make the conditions dependent on the PHY device having a
> > phydev->phy_link_change() implementation equal to the default
> > phy_link_change() provided by phylib. Otherwise, just check that the
> > custom phydev->phy_link_change() has been provided and is non-NULL.
> > Phylink provides phylink_phy_change().
> >
> > Thus, we will stop the state machine even for phylink-controlled PHYs
> > when using the MDIO bus PM ops.
> >
> > Reported-by: Wei Fang <wei.fang@....com>
> > Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
> > ---
> > I've only spent a few hours debugging this, and I'm unsure which patch
> > to even blame. I haven't noticed other issues apart from the WARN_ON()
> > originally added by commit 744d23c71af3 ("net: phy: Warn about incorrect
> > mdio_bus_phy_resume() state").
>
> I think the commit looks correct to restore the intended behaviour,
> but I'm puzzled why we haven't seen this before.
>
> As for the right commit, you're correct that 744d23c71af3 brings the
> warning. Phylink was never tested with suspend/resume initially, and
> that's been something of an after-thought (I don't have platforms that
> support suspend/resume and phylink, so this is something for other
> people to test.)
>
> However, your patch also brings up another concern:
>
> commit 4715f65ffa0520af0680dbfbedbe349f175adaf4
> Author: Richard Cochran <richardcochran@...il.com>
> Date: Wed Dec 25 18:16:15 2019 -0800
>
> adding that call to MII timestamping stuff looks wrong to me - it means
> MII timestamping doesn't get to know about link state if phylink is
> being used. I'm not sure whether it needs to or not. Maybe Richard can
> comment.
>
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Thanks for the review and for pointing this out.
If Richard does not respond to the comment request, I will submit the
attached patch to net-next once it reopens on Apr 7th. I will anyway
resubmit the PM-related change above to "net" today, without the RFC tag.
View attachment "0001-net-phy-extend-MII-timestamper-link-state-update-to-.patch" of type "text/x-diff" (3872 bytes)
Powered by blists - more mailing lists