lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <bdffa33c-e3eb-4c3b-adf3-99a02bc7d205@gmail.com>
Date: Mon, 22 Jan 2024 08:09:58 +0100
From: Rafał Miłecki <zajec5@...il.com>
To: Network Development <netdev@...r.kernel.org>
Cc: Andrew Lunn <andrew@...n.ch>, Heiner Kallweit <hkallweit1@...il.com>,
 Russell King <linux@...linux.org.uk>, Robert Marko <robimarko@...il.com>,
 Ansuel Smith <ansuelsmth@...il.com>, Daniel Golle <daniel@...rotopia.org>
Subject: Race in PHY subsystem? Attaching to PHY devices before they get
 probed

Hi!

I have MT7988 SoC board with following problem:
[   26.887979] Aquantia AQR113C mdio-bus:08: aqr107_wait_reset_complete failed: -110

This issue is known to occur when PHY's firmware is not running. After
some debugging I discovered that .config_init() CB gets called while
.probe() CB is still being executed.

It turns out mtk_soc_eth.c calls phylink_of_phy_connect() before my PHY
gets fully probed and it seems there is nothing in PHY subsystem
verifying that. Please note this PHY takes quite some time to probe as
it involves sending firmware to hardware.

Is that a possible race in PHY subsystem?
Should we have phy_attach_direct() wait for PHY to be fully probed?



I verified this issue with following patch although -EPROBE_DEFER didn't
work automagically and I had to re-do "ifconfig eth2 up" manually.

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 3611ea64875e..3be499d2376b 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1437,6 +1437,11 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
  	bool using_genphy = false;
  	int err;

+	if (!phydev->probed) {
+		phydev_warn(phydev, "PHY is not ready yet!\n");
+		return -EPROBE_DEFER;
+	}
+
  	/* For Ethernet device drivers that register their own MDIO bus, we
  	 * will have bus->owner match ndev_mod, so we do not want to increment
  	 * our own module->refcnt here, otherwise we would not be able to
@@ -1562,6 +1567,8 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
  		phydev->devlink = device_link_add(dev->dev.parent, &phydev->mdio.dev,
  						  DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);

+	phydev->probed = true;
+
  	return err;

  error:
@@ -3382,6 +3389,9 @@ static int phy_probe(struct device *dev)
  	if (err)
  		phy_device_reset(phydev, 1);

+	if (!err)
+		phydev->probed = true;
+
  	return err;
  }

diff --git a/include/linux/phy.h b/include/linux/phy.h
index 684efaeca07c..29875a22ac89 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -634,6 +634,8 @@ struct macsec_ops;
   * handling, as well as handling shifts in PHY hardware state
   */
  struct phy_device {
+	bool probed;
+
  	struct mdio_device mdio;

  	/* Information about the PHY type */

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ