Message-ID: <20210526224329.raaxr6b2s2uid4dw@skbuf>
Date:   Thu, 27 May 2021 01:43:29 +0300
From:   Vladimir Oltean <olteanv@...il.com>
To:     Oleksij Rempel <o.rempel@...gutronix.de>
Cc:     Woojung Huh <woojung.huh@...rochip.com>,
        UNGLinuxDriver@...rochip.com, Andrew Lunn <andrew@...n.ch>,
        Florian Fainelli <f.fainelli@...il.com>,
        Vivien Didelot <vivien.didelot@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>, kernel@...gutronix.de,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
        Russell King <linux@...linux.org.uk>,
        Michael Grzeschik <m.grzeschik@...gutronix.de>
Subject: Re: [PATCH net-next v3 4/9] net: phy: micrel: apply resume errata
 workaround for ksz8873 and ksz8863

On Wed, May 26, 2021 at 06:30:32AM +0200, Oleksij Rempel wrote:
> The ksz8873 and ksz8863 switches are affected by following errata:
> 
> | "Receiver error in 100BASE-TX mode following Soft Power Down"
> |
> | Some KSZ8873 devices may exhibit receiver errors after transitioning
> | from Soft Power Down mode to Normal mode, as controlled by register 195
> | (0xC3) bits [1:0]. When exiting Soft Power Down mode, the receiver
> | blocks may not start up properly, causing the PHY to miss data and
> | exhibit erratic behavior. The problem may appear on either port 1 or
> | port 2, or both ports. The problem occurs only for 100BASE-TX, not
> | 10BASE-T.
> |
> | END USER IMPLICATIONS
> | When the failure occurs, the following symptoms are seen on the affected
> | port(s):
> | - The port is able to link
> | - LED0 blinks, even when there is no traffic
> | - The MIB counters indicate receive errors (Rx Fragments, Rx Symbol
> |   Errors, Rx CRC Errors, Rx Alignment Errors)
> | - Only a small fraction of packets is correctly received and forwarded
> |   through the switch. Most packets are dropped due to receive errors.
> |
> | The failing condition cannot be corrected by the following:
> | - Removing and reconnecting the cable
> | - Hardware reset
> | - Software Reset and PCS Reset bits in register 67 (0x43)
> |
> | Work around:
> | The problem can be corrected by setting and then clearing the Port Power
> | Down bits (registers 29 (0x1D) and 45 (0x2D), bit 3). This must be done
> | separately for each affected port after returning from Soft Power Down
> | Mode to Normal Mode. The following procedure will ensure no further
> | issues due to this erratum. To enter Soft Power Down Mode, set register
> | 195 (0xC3), bits [1:0] = 10.
> |
> | To exit Soft Power Down Mode, follow these steps:
> | 1. Set register 195 (0xC3), bits [1:0] = 00 // Exit soft power down mode
> | 2. Wait 1ms minimum
> | 3. Set register 29 (0x1D), bit [3] = 1 // Enter PHY port 1 power down mode
> | 4. Set register 29 (0x1D), bit [3] = 0 // Exit PHY port 1 power down mode
> | 5. Set register 45 (0x2D), bit [3] = 1 // Enter PHY port 2 power down mode
> | 6. Set register 45 (0x2D), bit [3] = 0 // Exit PHY port 2 power down mode
> 
> This patch implements steps 2...6 of the suggested workaround. The first
> step needs to be implemented in the switch driver.

Am I right in understanding that register 195 (0xc3) is not a port register?

To hit the erratum, you have to enter Soft Power Down in the first place,
presumably by writing register 0xc3 from somewhere, right?

Where does Linux write this register from?

Once we find the place that enters/exits Soft Power Down mode, can't we
just toggle the Port Power Down bits for each port there, exactly as the
erratum workaround says, instead of fooling around with a PHY driver?
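
To make the suggestion concrete, here is a rough sketch of the full erratum
sequence done in one place, next to the code that exits Soft Power Down. It
is a user-space illustration against a fake register array, not kernel code.
The register numbers and bit positions come from the erratum text quoted
above; everything else (sw_read(), sw_write(), sw_rmw(), sw_delay_ms(), the
macro names) is hypothetical and only stands in for whatever register
accessors the switch driver really has.

```c
#include <stddef.h>
#include <stdint.h>

/* Register numbers and bits as given in the erratum; macro names are made up. */
#define REG_POWER_MGMT     0xC3       /* register 195, bits [1:0] = power mode */
#define POWER_MODE_MASK    0x03
#define POWER_MODE_NORMAL  0x00       /* bits [1:0] = 00 */
#define POWER_MODE_SOFT_PD 0x02       /* bits [1:0] = 10 */
#define REG_PORT1_PWR      0x1D       /* register 29 */
#define REG_PORT2_PWR      0x2D       /* register 45 */
#define PORT_POWER_DOWN    (1u << 3)  /* bit 3: Port Power Down */

/* Hypothetical stand-in for the switch register file. */
static uint8_t fake_regs[256];
static unsigned int write_count;

static uint8_t sw_read(uint8_t reg)
{
	return fake_regs[reg];
}

static void sw_write(uint8_t reg, uint8_t val)
{
	fake_regs[reg] = val;
	write_count++;
}

/* Placeholder: a real driver would sleep here (at least 1 ms per the erratum). */
static void sw_delay_ms(unsigned int ms)
{
	(void)ms;
}

/* Read-modify-write: update only the bits selected by mask. */
static void sw_rmw(uint8_t reg, uint8_t mask, uint8_t val)
{
	sw_write(reg, (uint8_t)((sw_read(reg) & ~mask) | (val & mask)));
}

static void sw_enter_soft_power_down(void)
{
	sw_rmw(REG_POWER_MGMT, POWER_MODE_MASK, POWER_MODE_SOFT_PD);
}

static void sw_exit_soft_power_down(void)
{
	static const uint8_t port_regs[] = { REG_PORT1_PWR, REG_PORT2_PWR };
	size_t i;

	/* Step 1: leave Soft Power Down mode. */
	sw_rmw(REG_POWER_MGMT, POWER_MODE_MASK, POWER_MODE_NORMAL);
	/* Step 2: wait 1 ms minimum. */
	sw_delay_ms(1);
	/* Steps 3..6: set, then clear, Port Power Down on each port. */
	for (i = 0; i < sizeof(port_regs) / sizeof(port_regs[0]); i++) {
		sw_rmw(port_regs[i], PORT_POWER_DOWN, PORT_POWER_DOWN);
		sw_rmw(port_regs[i], PORT_POWER_DOWN, 0);
	}
}
```

With all six steps in one function, the ordering between step 1 and steps
2..6 is guaranteed by construction and does not depend on when the PHY
library decides to call .resume.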

> 
> Signed-off-by: Oleksij Rempel <o.rempel@...gutronix.de>
> ---
>  drivers/net/phy/micrel.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
> index 227d88db7d27..f03188ed953a 100644
> --- a/drivers/net/phy/micrel.c
> +++ b/drivers/net/phy/micrel.c
> @@ -1048,6 +1048,26 @@ static int ksz8873mll_config_aneg(struct phy_device *phydev)
>  	return 0;
>  }
>  
> +static int ksz886x_resume(struct phy_device *phydev)
> +{
> +	int ret;
> +
> +	/* Apply errata workaround for KSZ8863 and KSZ8873:
> +	 * Receiver error in 100BASE-TX mode following Soft Power Down
> +	 *
> +	 * When exiting Soft Power Down mode, the receiver blocks may not start
> +	 * up properly, causing the PHY to miss data and exhibit erratic
> +	 * behavior.
> +	 */
> +	usleep_range(1000, 2000);
> +
> +	ret = phy_set_bits(phydev, MII_BMCR, BMCR_PDOWN);
> +	if (ret)
> +		return ret;
> +
> +	return phy_clear_bits(phydev, MII_BMCR, BMCR_PDOWN);
> +}
> +
>  static int kszphy_get_sset_count(struct phy_device *phydev)
>  {
>  	return ARRAY_SIZE(kszphy_hw_stats);
> @@ -1401,7 +1421,7 @@ static struct phy_driver ksphy_driver[] = {
>  	/* PHY_BASIC_FEATURES */
>  	.config_init	= kszphy_config_init,
>  	.suspend	= genphy_suspend,
> -	.resume		= genphy_resume,
> +	.resume		= ksz886x_resume,

Are you able to explain the relation between the call paths of
phy_resume() and the lifetime of the Soft Power Down setting of the
switch? How do we know that the PHYs are resumed after the switch has
exited Soft Power Down mode?

>  }, {
>  	.name		= "Micrel KSZ87XX Switch",
>  	/* PHY_BASIC_FEATURES */
> -- 
> 2.29.2
> 
