lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240806112747.soclko5vex2f2c64@skbuf>
Date: Tue, 6 Aug 2024 14:27:47 +0300
From: Vladimir Oltean <vladimir.oltean@....com>
To: Jon Hunter <jonathanh@...dia.com>
Cc: Bartosz Golaszewski <brgl@...ev.pl>, Andrew Lunn <andrew@...n.ch>,
	Heiner Kallweit <hkallweit1@...il.com>,
	Russell King <linux@...linux.org.uk>,
	"David S . Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
	Bartosz Golaszewski <bartosz.golaszewski@...aro.org>,
	"linux-tegra@...r.kernel.org" <linux-tegra@...r.kernel.org>,
	Brad Griffis <bgriffis@...dia.com>,
	Clark Wang <xiaoning.wang@....com>, Wei Fang <wei.fang@....com>
Subject: Re: [RESEND PATCH net-next v3 2/4] net: phy: aquantia: wait for FW
 reset before checking the vendor ID

Hi Jon,

On Tue, Jul 30, 2024 at 10:59:59AM +0100, Jon Hunter wrote:
> Hi Bartosz,
> 
> On 08/07/2024 08:50, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
> > 
> > Checking the firmware register before it complete the boot process makes
> > no sense, it will report 0 even if FW is available from internal memory.
> > Always wait for FW to boot before continuing or we'll unnecessarily try
> > to load it from nvmem/filesystem and fail.
> > 
> > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
> > ---
> >   drivers/net/phy/aquantia/aquantia_firmware.c | 4 ++++
> >   1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/net/phy/aquantia/aquantia_firmware.c b/drivers/net/phy/aquantia/aquantia_firmware.c
> > index 0c9640ef153b..524627a36c6f 100644
> > --- a/drivers/net/phy/aquantia/aquantia_firmware.c
> > +++ b/drivers/net/phy/aquantia/aquantia_firmware.c
> > @@ -353,6 +353,10 @@ int aqr_firmware_load(struct phy_device *phydev)
> >   {
> >   	int ret;
> > +	ret = aqr_wait_reset_complete(phydev);
> > +	if (ret)
> > +		return ret;
> > +
> >   	/* Check if the firmware is not already loaded by pooling
> >   	 * the current version returned by the PHY. If 0 is returned,
> >   	 * no firmware is loaded.
> 
> 
> Although this fixed another issue we were seeing with this driver, we have
> been reviewing this change and have a question about it.
> 
> According to the description for the function aqr_wait_reset_complete() this
> function is intended to give the device time to load firmware and check
> there is a valid firmware ID.
> 
> If a valid firmware ID (non-zero) is detected, then
> aqr_wait_reset_complete() will return 0 (because phy_read_mmd_poll_timeout()
> returns 0 on success and -ETIMEDOUT upon a timeout).
> 
> If it times out, then it would appear that with the above code we don't
> attempt to load the firmware by any other means?
> 
> Hence, I was wondering if we want this ...
> 
> diff --git a/drivers/net/phy/aquantia/aquantia_firmware.c
> b/drivers/net/phy/aquantia/aquantia_firmware.c
> index 524627a36c6f..a167f42ae36b 100644
> --- a/drivers/net/phy/aquantia/aquantia_firmware.c
> +++ b/drivers/net/phy/aquantia/aquantia_firmware.c
> @@ -353,16 +353,12 @@ int aqr_firmware_load(struct phy_device *phydev)
>  {
>         int ret;
> 
> -       ret = aqr_wait_reset_complete(phydev);
> -       if (ret)
> -               return ret;
> -
> -       /* Check if the firmware is not already loaded by pooling
> +       /* Check if the firmware is not already loaded by polling
>          * the current version returned by the PHY. If 0 is returned,
> -        * no firmware is loaded.
> +        * firmware is loaded.
>          */
> -       ret = phy_read_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_FW_ID);
> -       if (ret > 0)
> +       ret = aqr_wait_reset_complete(phydev);
> +       if (!ret)
>                 goto exit;
> 
>         ret = aqr_firmware_load_nvmem(phydev);

I agree with your analysis and we also noticed this.

But actually, you wouldn't want to ignore other return codes from
phy_read_mmd_poll_timeout() like real errors from phy_read_mmd():
-ENODEV, -ENXIO etc.

I found that the logic is more readable with a switch/case statement as below.

diff --git a/drivers/net/phy/aquantia/aquantia_firmware.c b/drivers/net/phy/aquantia/aquantia_firmware.c
index 524627a36c6f..d839f64471bd 100644
--- a/drivers/net/phy/aquantia/aquantia_firmware.c
+++ b/drivers/net/phy/aquantia/aquantia_firmware.c
@@ -353,26 +353,33 @@ int aqr_firmware_load(struct phy_device *phydev)
 {
 	int ret;
 
-	ret = aqr_wait_reset_complete(phydev);
-	if (ret)
-		return ret;
-
-	/* Check if the firmware is not already loaded by pooling
-	 * the current version returned by the PHY. If 0 is returned,
-	 * no firmware is loaded.
+	/* Check if the firmware is not already loaded by polling
+	 * the current version returned by the PHY.
 	 */
-	ret = phy_read_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_FW_ID);
-	if (ret > 0)
-		goto exit;
+	ret = aqr_wait_reset_complete(phydev);
+	switch (ret) {
+	case 0:
+		/* Some firmware is loaded => do nothing */
+		return 0;
+	case -ETIMEDOUT:
+		/* VEND1_GLOBAL_FW_ID still reads 0 after 2 seconds of polling.
+		 * We don't have full confidence that no firmware is loaded (in
+		 * theory it might just not have loaded yet), but we will
+		 * assume that, and load a new image.
+		 */
+		ret = aqr_firmware_load_nvmem(phydev);
+		if (!ret)
+			goto exit;
 
-	ret = aqr_firmware_load_nvmem(phydev);
-	if (!ret)
-		goto exit;
+		ret = aqr_firmware_load_fs(phydev);
+		if (ret)
+			return ret;
 
-	ret = aqr_firmware_load_fs(phydev);
-	if (ret)
+		break;
+	default:
+		/* PHY read error, propagate it to the caller */
 		return ret;
+	}
 
-exit:
 	return 0;
 }

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ