Message-ID: <DB8PR04MB6795F52C1E319BE836BEF529E6769@DB8PR04MB6795.eurprd04.prod.outlook.com>
Date:   Tue, 6 Apr 2021 02:06:03 +0000
From:   Joakim Zhang <qiangqing.zhang@....com>
To:     "christian.melki@...ata.com" <christian.melki@...ata.com>,
        Heiner Kallweit <hkallweit1@...il.com>,
        "andrew@...n.ch" <andrew@...n.ch>,
        "linux@...linux.org.uk" <linux@...linux.org.uk>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "kuba@...nel.org" <kuba@...nel.org>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        dl-linux-imx <linux-imx@....com>
Subject: RE: [PATCH] net: phy: fix PHY possibly unwork after MDIO bus resume
 back


Hi Christian,

> -----Original Message-----
> From: Christian Melki <christian.melki@...ata.com>
> Sent: April 5, 2021 16:44
> To: Heiner Kallweit <hkallweit1@...il.com>; Joakim Zhang
> <qiangqing.zhang@....com>; andrew@...n.ch; linux@...linux.org.uk;
> davem@...emloft.net; kuba@...nel.org
> Cc: netdev@...r.kernel.org; linux-kernel@...r.kernel.org; dl-linux-imx
> <linux-imx@....com>
> Subject: Re: [PATCH] net: phy: fix PHY possibly unwork after MDIO bus resume
> back
> 
> On 4/5/21 12:48 AM, Heiner Kallweit wrote:
> > On 04.04.2021 16:09, Heiner Kallweit wrote:
> >> On 04.04.2021 12:07, Joakim Zhang wrote:
> >>> commit 4c0d2e96ba055 ("net: phy: consider that suspend2ram may cut
> >>> off PHY power") invokes phy_init_hw() on MDIO bus resume, which
> >>> soft resets the PHY if the PHY driver implements the soft_reset callback.
> >>> commit 764d31cacfe4 ("net: phy: micrel: set soft_reset callback to
> >>> genphy_soft_reset for KSZ8081") adds soft_reset for KSZ8081. After
> >>> these two patches, I found that the i.MX6UL 14x14 EVK, which is
> >>> connected to a KSZ8081RNB, no longer works after system resume. The
> >>> MAC driver is fec_main.c.
> >>>
> >>> It's obvious that initializing the PHY hardware on MDIO bus resume
> >>> can introduce a regression when the PHY implements soft_reset.
> >>> When I
> >>
> >> Why is this obvious? Please elaborate on why a soft reset should
> >> break something.
> >>
> >>> was debugging, I found that the PHY works fine if the MAC doesn't
> >>> support suspend/resume, or if phy_stop()/phy_start() are not called
> >>> during suspend/resume. This made me realize that the PHY state machine,
> >>> phy_state_machine(), could be doing something that breaks the PHY.
> >>>
> >>> As we know, on system resume the MAC resumes first and then the MDIO
> >>> bus resumes. When the MAC resumes, it usually invokes phy_start(),
> >>> which changes the PHY state to PHY_UP and then triggers the state
> >>> machine to run immediately. phy_state_machine() then starts/configures
> >>> auto-negotiation, changes the PHY state to PHY_NOLINK, and afterwards
> >>> periodically checks the PHY link status. When the MDIO bus resumes, it
> >>> initializes the PHY hardware, including soft_reset. What soft_reset
> >>> affects seems to vary between PHYs. For the KSZ8081RNB, a soft reset
> >>> performed while in the PHY_NOLINK state means it never completes
> >>> auto-negotiation.
> >>
> >> Why? That would need to be checked in detail. Maybe chip errata
> >> documentation provides a hint.
> >>
> >
> > The KSZ8081 spec says the following about bit BMCR_PDOWN:
> >
> > If software reset (Register 0.15) is
> > used to exit power-down mode
> > (Register 0.11 = 1), two software
> > reset writes (Register 0.15 = 1) are
> > required. The first write clears
> > power-down mode; the second
> > write resets the chip and re-latches
> > the pin strapping pin values.
> >
> > Maybe this causes the issue you see and genphy_soft_reset() isn't
> > appropriate for this PHY. Please re-test with the KSZ8081 soft reset
> > following the spec comment.
> >
> 
> Interesting. Never expected that behavior.
> Thanks for catching it. Skimmed through the datasheets/erratas.
> This is what I found (micrel.c):
> 
> 10/100:
> 8001 - Unaffected?
> 8021/8031 - Double reset after PDOWN.
> 8041 - Errata. PDOWN broken. Recommended do not use. Unclear if reset
> solves the issue since errata says no error after reset but is also claiming that
> only toggling PDOWN (may) or power will help.
> 8051 - Double reset after PDOWN.
> 8061 - Double reset after PDOWN.
> 8081 - Double reset after PDOWN.
> 8091 - Double reset after PDOWN.
> 
> 10/100/1000:
> Nothing in gigabit afaics.
> 
> Switches:
> 8862 - Not affected?
> 8863 - Errata. PDOWN broken. Reset will not help. Workaround exists.
> 8864 - Not affected?
> 8873 - Errata. PDOWN broken. Reset will not help. Workaround exists.
> 9477 - Errata. PDOWN broken. Will randomly cause link failure on adjacent links.
> Do not use.
> 
> This certainly explains a lot.
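
For reference, the double-reset sequence from the KSZ8081 datasheet text quoted above can be sketched against a mock register model. This is only an illustration, not driver code: struct mock_phy, mock_write_bmcr() and mock_soft_reset_from_pdown() are hypothetical names, and the model assumes the reset bit behaves exactly as that datasheet paragraph describes. The bit values are the standard BMCR ones (0.15 reset, 0.11 power-down).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard Basic Mode Control Register (register 0) bits. */
#define BMCR_RESET 0x8000u /* Register 0.15: software reset */
#define BMCR_PDOWN 0x0800u /* Register 0.11: power-down */

/* Hypothetical in-memory stand-in for the PHY; no real MDIO access. */
struct mock_phy {
	uint16_t bmcr;
	int resets_done; /* reset writes that actually reset the chip */
};

/* Assumed behaviour, taken from the quoted datasheet text: while powered
 * down, a reset write only clears power-down; otherwise it resets the chip.
 */
static void mock_write_bmcr(struct mock_phy *phy, uint16_t val)
{
	assert(phy != NULL);

	if (!(val & BMCR_RESET)) {
		phy->bmcr = val;
		return;
	}

	if (phy->bmcr & BMCR_PDOWN) {
		/* First write: leaves power-down, chip is not reset yet. */
		phy->bmcr = (uint16_t)(phy->bmcr & ~BMCR_PDOWN);
	} else {
		/* Placeholder for chip defaults, which this model does not track. */
		phy->bmcr = 0;
		phy->resets_done++;
	}
}

/* Issue the reset write once, and a second time if the PHY was powered
 * down. Returns the number of reset writes issued.
 */
static int mock_soft_reset_from_pdown(struct mock_phy *phy)
{
	int writes = 0;
	int was_pdown = (phy->bmcr & BMCR_PDOWN) != 0;

	mock_write_bmcr(phy, (uint16_t)(phy->bmcr | BMCR_RESET));
	writes++;

	if (was_pdown) {
		mock_write_bmcr(phy, (uint16_t)(phy->bmcr | BMCR_RESET));
		writes++;
	}

	return writes;
}
```

Starting from a powered-down mock PHY, the helper issues two reset writes and only the second one counts as a real reset. With power-down already clear, a single write is enough.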

Thanks for digging into it. As I discussed with you before, there is no problem with these two fixes if I do ifdown/ifup,
which takes almost the same route as suspend/resume. The difference is that ifdown/ifup starts the state machine after
initializing the PHY, whereas on resume the state machine is already running before the PHY is initialized. I think the key
is to figure out what the soft reset affects in the 8081; there is no hint in the spec.
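
The ordering difference can be shown with a toy model. It is a sketch under a loud assumption: that a soft reset discards an auto-negotiation that is already in progress, which is exactly the open question here and is not confirmed by the spec. All mock_* names are hypothetical and only mimic the PHY_UP/PHY_NOLINK transitions described in the patch description; none of this is phylib code.

```c
#include <stdbool.h>

enum mock_state { MOCK_PHY_UP, MOCK_PHY_NOLINK, MOCK_PHY_RUNNING };

struct mock_phydev {
	enum mock_state state; /* software state machine */
	bool aneg_started;     /* hardware side: auto-negotiation in progress */
};

/* ASSUMPTION: hardware init (with soft reset) drops a running auto-nego. */
static void mock_init_hw(struct mock_phydev *p)
{
	p->aneg_started = false;
}

static void mock_phy_start(struct mock_phydev *p)
{
	p->state = MOCK_PHY_UP;
}

/* One pass of the simplified state machine. */
static void mock_state_machine(struct mock_phydev *p)
{
	switch (p->state) {
	case MOCK_PHY_UP:
		p->aneg_started = true; /* start/config auto-nego */
		p->state = MOCK_PHY_NOLINK;
		break;
	case MOCK_PHY_NOLINK:
		if (p->aneg_started) /* link only comes up if auto-nego ran */
			p->state = MOCK_PHY_RUNNING;
		break;
	case MOCK_PHY_RUNNING:
		break;
	}
}

/* ifdown/ifup order: init hardware first, then start the state machine. */
static bool mock_ifup(void)
{
	struct mock_phydev p = { MOCK_PHY_NOLINK, false };

	mock_init_hw(&p);
	mock_phy_start(&p);
	mock_state_machine(&p);
	mock_state_machine(&p);
	return p.state == MOCK_PHY_RUNNING;
}

/* Resume order: MAC resume runs the state machine, then the MDIO bus
 * resume re-initializes the hardware. force_phy_up models the patch.
 */
static bool mock_resume(bool force_phy_up)
{
	struct mock_phydev p = { MOCK_PHY_NOLINK, false };

	mock_phy_start(&p);     /* MAC resume */
	mock_state_machine(&p); /* now in NOLINK with auto-nego running */
	mock_init_hw(&p);       /* MDIO bus resume */
	if (force_phy_up)
		p.state = MOCK_PHY_UP;
	mock_state_machine(&p);
	mock_state_machine(&p);
	return p.state == MOCK_PHY_RUNNING;
}
```

In this model mock_ifup() reaches the running state, mock_resume(false) stays stuck in NOLINK, and mock_resume(true) recovers because auto-negotiation is restarted after the hardware init.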

Best Regards,
Joakim Zhang
> >>>
> >>> This patch changes the PHY state to PHY_UP on MDIO bus resume, which
> >>> should be reasonable after the PHY hardware has been re-initialized.
> >>> It also gives the state machine a chance to start/configure
> >>> auto-negotiation again.
> >>>
> >>
> >> If the MAC driver calls phy_stop() on suspend, then phydev->suspended
> >> is true and mdio_bus_phy_may_suspend() returns false. As a
> >> consequence
> >> phydev->suspended_by_mdio_bus is false and mdio_bus_phy_resume()
> >> skips the PHY hw initialization.
> >> Please also note that mdio_bus_phy_suspend() calls phy_stop_machine()
> >> that sets the state to PHY_UP.
> >>
> >
> > Forgot that MDIO bus suspend is done before MAC driver suspend.
> > Therefore disregard this part for now.
> >
> >> Having said that the current argumentation isn't convincing. I'm not
> >> aware of such issues on other systems, therefore it's likely that
> >> something is system-dependent.
> >>
> >> Please check the exact call sequence on your system, maybe it
> >> provides a hint.
> >>
> >>> Signed-off-by: Joakim Zhang <qiangqing.zhang@....com>
> >>> ---
> >>>  drivers/net/phy/phy_device.c | 7 +++++++
> >>>  1 file changed, 7 insertions(+)
> >>>
> >>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> >>> index cc38e326405a..312a6f662481 100644
> >>> --- a/drivers/net/phy/phy_device.c
> >>> +++ b/drivers/net/phy/phy_device.c
> >>> @@ -306,6 +306,13 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
> >>>  	ret = phy_resume(phydev);
> >>>  	if (ret < 0)
> >>>  		return ret;
> >>> +
> >>> +	/* PHY state could be changed to PHY_NOLINK from MAC controller resume
> >>> +	 * route with phy_start(), here change to PHY_UP after re-initializing
> >>> +	 * PHY hardware, let PHY state machine to start/config auto-nego again.
> >>> +	 */
> >>> +	phydev->state = PHY_UP;
> >>> +
> >>>  no_resume:
> >>>  	if (phydev->attached_dev && phydev->adjust_link)
> >>>  		phy_start_machine(phydev);
> >>>
> >>
> >
