lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z8Xvmqp2sukNPzvt@shell.armlinux.org.uk>
Date: Mon, 3 Mar 2025 18:06:18 +0000
From: "Russell King (Oracle)" <linux@...linux.org.uk>
To: Vladimir Oltean <vladimir.oltean@....com>,
	Richard Cochran <richardcochran@...il.com>
Cc: netdev@...r.kernel.org, Andrew Lunn <andrew@...n.ch>,
	Heiner Kallweit <hkallweit1@...il.com>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Florian Fainelli <florian.fainelli@...adcom.com>,
	Wei Fang <wei.fang@....com>
Subject: Re: [RFC PATCH net] net: phy: allow MDIO bus PM ops to start/stop
 state machine for phylink-controlled PHY

On Tue, Feb 25, 2025 at 05:31:56PM +0200, Vladimir Oltean wrote:
> DSA has 2 kinds of drivers:
> 
> 1. Those who call dsa_switch_suspend() and dsa_switch_resume() from
>    their device PM ops: qca8k-8xxx, bcm_sf2, microchip ksz
> 2. Those who don't: all others. The above methods should be optional.
> 
> For type 1, dsa_switch_suspend() calls dsa_user_suspend() -> phylink_stop(),
> and dsa_switch_resume() calls dsa_user_resume() -> phylink_start().
> These seem good candidates for setting mac_managed_pm = true because
> that is essentially its definition, but that does not seem to be the
> biggest problem for now, and is not what this change focuses on.
> 
> Talking strictly about the 2nd category of drivers here, I have noticed
> that these also trigger the
> 
> 	WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY &&
> 		phydev->state != PHY_UP);
> 
> from mdio_bus_phy_resume(), because the PHY state machine is running.
> It's running as a result of a previous dsa_user_open() -> ... ->
> phylink_start() -> phy_start(), and AFAICS, mdio_bus_phy_suspend() was
> supposed to have called phy_stop_machine(), but it didn't. So this is
> why the PHY is in state PHY_NOLINK by the time mdio_bus_phy_resume()
> runs.
> 
> mdio_bus_phy_suspend() did not call phy_stop_machine() because for
> phylink, the phydev->adjust_link function pointer is NULL. This seems a
> technicality introduced by commit fddd91016d16 ("phylib: fix PAL state
> machine restart on resume"). That commit was written before phylink
> existed, and was intended to avoid crashing with consumer drivers which
> don't use the PHY state machine - phylink does.
> 
> Make the conditions dependent on the PHY device having a
> phydev->phy_link_change() implementation equal to the default
> phy_link_change() provided by phylib. Otherwise, just check that the
> custom phydev->phy_link_change() has been provided and is non-NULL.
> Phylink provides phylink_phy_change().
> 
> Thus, we will stop the state machine even for phylink-controlled PHYs
> when using the MDIO bus PM ops.
> 
> Reported-by: Wei Fang <wei.fang@....com>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
> ---
> I've only spent a few hours debugging this, and I'm unsure which patch
> to even blame. I haven't noticed other issues apart from the WARN_ON()
> originally added by commit 744d23c71af3 ("net: phy: Warn about incorrect
> mdio_bus_phy_resume() state").

I think the commit looks correct to restore the intended behaviour,
but I'm puzzled why we haven't seen this before.

As for the right commit, you're correct that 744d23c71af3 brings the
warning. Phylink was never tested with suspend/resume initially, and
that's been something of an after-thought (I don't have platforms that
support suspend/resume and phylink, so this is something for other
people to test.)

However, your patch also brings up another concern:

commit 4715f65ffa0520af0680dbfbedbe349f175adaf4
Author: Richard Cochran <richardcochran@...il.com>
Date:   Wed Dec 25 18:16:15 2019 -0800

adding that call to MII timestamping stuff looks wrong to me - it means
MII timestamping doesn't get to know about link state if phylink is
being used. I'm not sure whether it needs to or not. Maybe Richard can
comment.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ