lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 5 Apr 2021 00:48:58 +0200
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     Joakim Zhang <qiangqing.zhang@....com>, andrew@...n.ch,
        linux@...linux.org.uk, davem@...emloft.net, kuba@...nel.org
Cc:     netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-imx@....com, christian.melki@...ata.com
Subject: Re: [PATCH] net: phy: fix PHY possibly unwork after MDIO bus resume
 back

On 04.04.2021 16:09, Heiner Kallweit wrote:
> On 04.04.2021 12:07, Joakim Zhang wrote:
>> commit 4c0d2e96ba055 ("net: phy: consider that suspend2ram may cut
>> off PHY power") invokes phy_init_hw() when MDIO bus resume, it will
>> soft reset PHY if PHY driver implements soft_reset callback.
>> commit 764d31cacfe4 ("net: phy: micrel: set soft_reset callback to
>> genphy_soft_reset for KSZ8081") adds soft_reset for KSZ8081. After these
>> two patches, I found i.MX6UL 14x14 EVK which connected to KSZ8081RNB doesn't
>> work any more when system resume back, MAC driver is fec_main.c.
>>
>> It's obvious that initializing PHY hardware when MDIO bus resume back
>> would introduce some regression when PHY implements soft_reset. When I
> 
> Why is this obvious? Please elaborate on why a soft reset should break
> something.
> 
>> am debugging, I found PHY works fine if MAC doesn't support suspend/resume
>> or phy_stop()/phy_start() doesn't been called during suspend/resume. This
>> let me realize, PHY state machine phy_state_machine() could do something
>> breaks the PHY.
>>
>> As we known, MAC resume first and then MDIO bus resume when system
>> resume back from suspend. When MAC resume, usually it will invoke
>> phy_start() where to change PHY state to PHY_UP, then trigger the stat> machine to run now. In phy_state_machine(), it will start/config
>> auto-nego, then change PHY state to PHY_NOLINK, what to next is
>> periodically check PHY link status. When MDIO bus resume, it will
>> initialize PHY hardware, including soft_reset, what would soft_reset
>> affect seems various from different PHYs. For KSZ8081RNB, when it in
>> PHY_NOLINK state and then perform a soft reset, it will never complete
>> auto-nego.
> 
> Why? That would need to be checked in detail. Maybe chip errata
> documentation provides a hint.
> 

The KSZ8081 spec says the following about bit BMCR_PDOWN:

If software reset (Register 0.15) is
used to exit power-down mode
(Register 0.11 = 1), two software
reset writes (Register 0.15 = 1) are
required. The first write clears
power-down mode; the second
write resets the chip and re-latches
the pin strapping pin values.

Maybe this causes the issue you see and genphy_soft_reset() isn't
appropriate for this PHY. Please re-test with the KSZ8081 soft reset
following the spec comment.


>>
>> This patch changes PHY state to PHY_UP when MDIO bus resume back, it
>> should be reasonable after PHY hardware re-initialized. Also give state
>> machine a chance to start/config auto-nego again.
>>
> 
> If the MAC driver calls phy_stop() on suspend, then phydev->suspended
> is true and mdio_bus_phy_may_suspend() returns false. As a consequence
> phydev->suspended_by_mdio_bus is false and mdio_bus_phy_resume()
> skips the PHY hw initialization.
> Please also note that mdio_bus_phy_suspend() calls phy_stop_machine()
> that sets the state to PHY_UP.
> 

Forgot that MDIO bus suspend is done before MAC driver suspend.
Therefore disregard this part for now.

> Having said that the current argumentation isn't convincing. I'm not
> aware of such issues on other systems, therefore it's likely that
> something is system-dependent.
> 
> Please check the exact call sequence on your system, maybe it
> provides a hint.
> 
>> Signed-off-by: Joakim Zhang <qiangqing.zhang@....com>
>> ---
>>  drivers/net/phy/phy_device.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>> index cc38e326405a..312a6f662481 100644
>> --- a/drivers/net/phy/phy_device.c
>> +++ b/drivers/net/phy/phy_device.c
>> @@ -306,6 +306,13 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
>>  	ret = phy_resume(phydev);
>>  	if (ret < 0)
>>  		return ret;
>> +
>> +	/* PHY state could be changed to PHY_NOLINK from MAC controller resume
>> +	 * rounte with phy_start(), here change to PHY_UP after re-initializing
>> +	 * PHY hardware, let PHY state machine to start/config auto-nego again.
>> +	 */
>> +	phydev->state = PHY_UP;
>> +
>>  no_resume:
>>  	if (phydev->attached_dev && phydev->adjust_link)
>>  		phy_start_machine(phydev);
>>
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ