lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMRc=McF93F6YsQ+eT9oOe+c=2ZCQ3rBdj+-3Ruy8iO1B-syjw@mail.gmail.com>
Date: Thu, 18 Jul 2024 15:04:00 +0200
From: Bartosz Golaszewski <brgl@...ev.pl>
To: Jon Hunter <jonathanh@...dia.com>
Cc: Andrew Lunn <andrew@...n.ch>, Heiner Kallweit <hkallweit1@...il.com>, 
	Russell King <linux@...linux.org.uk>, "David S . Miller" <davem@...emloft.net>, 
	Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, 
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Bartosz Golaszewski <bartosz.golaszewski@...aro.org>, 
	"linux-tegra@...r.kernel.org" <linux-tegra@...r.kernel.org>, Brad Griffis <bgriffis@...dia.com>
Subject: Re: [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the
 GLOBAL_CFG to start returning real values

On Thu, Jul 18, 2024 at 2:23 PM Jon Hunter <jonathanh@...dia.com> wrote:
>
> Hi Bartosz,
>
> On 08/07/2024 08:50, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
> >
> > When the PHY is first coming up (or resuming from suspend), it's
> > possible that although the FW status shows as running, we still see
> > zeroes in the GLOBAL_CFG set of registers and cannot determine available
> > modes. Since all models support 10M, add a poll and wait the config to
> > become available.
> >
> > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
> > ---
> >   drivers/net/phy/aquantia/aquantia_main.c | 8 +++++++-
> >   1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
> > index 974795bd0860..2c8ba2725a91 100644
> > --- a/drivers/net/phy/aquantia/aquantia_main.c
> > +++ b/drivers/net/phy/aquantia/aquantia_main.c
> > @@ -652,7 +652,13 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev)
> >       unsigned long *possible = phydev->possible_interfaces;
> >       unsigned int serdes_mode, rate_adapt;
> >       phy_interface_t interface;
> > -     int i, val;
> > +     int i, val, ret;
> > +
> > +     ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
> > +                                     VEND1_GLOBAL_CFG_10M, val, val != 0,
> > +                                     1000, 100000, false);
> > +     if (ret)
> > +             return ret;
> >
> >       /* Walk the media-speed configuration registers to determine which
> >        * host-side serdes modes may be used by the PHY depending on the
>
>
> With the current -next and mainline we are seeing the following issue on
> our Tegra234 Jetson AGX Orin platform ...
>
>   Aquantia AQR113C stmmac-0:00: aqr107_fill_interface_modes failed: -110
>   tegra-mgbe 6800000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -110)
>
>
> We have tracked it down to this change and looks like our PHY does not
> support 10M ...
>
> $ ethtool eth0
> Settings for eth0:
>          Supported ports: [  ]
>          Supported link modes:   100baseT/Full
>                                  1000baseT/Full
>                                  10000baseT/Full
>                                  1000baseKX/Full
>                                  10000baseKX4/Full
>                                  10000baseKR/Full
>                                  2500baseT/Full
>                                  5000baseT/Full
>
> The following fixes this for this platform ...
>
> diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
> index d12e35374231..0b2db486d8bd 100644
> --- a/drivers/net/phy/aquantia/aquantia_main.c
> +++ b/drivers/net/phy/aquantia/aquantia_main.c
> @@ -656,7 +656,7 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev)
>          int i, val, ret;
>
>          ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
> -                                       VEND1_GLOBAL_CFG_10M, val, val != 0,
> +                                       VEND1_GLOBAL_CFG_100M, val, val != 0,
>                                          1000, 100000, false);
>          if (ret)
>                  return ret;
>
>
> However, I am not sure if this is guaranteed to work for all?

Ah cr*p. No, I don't think it is. We should take the first supported
mode for a given PHY I think.

>
> On a related note, we had also found an issue with this PHY where
> the PHY reset is not working quite correctly. What we see is that
> when polling for the firmware ID in aqr107_wait_reset_complete()
> is that value in the firware ID registers transitions from 0 to
> 0xffff and then to the firmware ID. We have been testing the
> following change to fix this ...
>
> diff --git a/drivers/net/phy/aquantia/aquantia.h b/drivers/net/phy/aquantia/aquantia.h
> index 2465345081f8..278e3b167c58 100644
> --- a/drivers/net/phy/aquantia/aquantia.h
> +++ b/drivers/net/phy/aquantia/aquantia.h
> @@ -20,6 +20,7 @@
>   #define VEND1_GLOBAL_FW_ID                     0x0020
>   #define VEND1_GLOBAL_FW_ID_MAJOR               GENMASK(15, 8)
>   #define VEND1_GLOBAL_FW_ID_MINOR               GENMASK(7, 0)
> +#define VEND1_GLOBAL_FW_ID_MASK                        GENMASK(15, 0)
>
>   #define VEND1_GLOBAL_MAILBOX_INTERFACE1                        0x0200
>   #define VEND1_GLOBAL_MAILBOX_INTERFACE1_EXECUTE                BIT(15)
> diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
> index 0b2db486d8bd..5023fd70050d 100644
> --- a/drivers/net/phy/aquantia/aquantia_main.c
> +++ b/drivers/net/phy/aquantia/aquantia_main.c
> @@ -447,7 +447,9 @@ int aqr_wait_reset_complete(struct phy_device *phydev)
>          int val;
>
>          return phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
> -                                        VEND1_GLOBAL_FW_ID, val, val != 0,
> +                                        VEND1_GLOBAL_FW_ID, val,
> +                                        ((val & VEND1_GLOBAL_FW_ID_MASK) != 0 &&
> +                                        (val & VEND1_GLOBAL_FW_ID_MASK) != VEND1_GLOBAL_FW_ID_MASK),
>                                           20000, 2000000, false);
>   }
>
> I have not tried the resume use-case, but curious if this may
> also help here?
>

Interesting. Unfortunately this doesn't help with the suspend/resume
use-case if I revert my offending patch.

Bart

> Cheers
> Jon
>
> --
> nvpublic

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ