lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241113144229.3ff4bgsalvj7spb7@skbuf>
Date: Wed, 13 Nov 2024 16:42:29 +0200
From: Vladimir Oltean <olteanv@...il.com>
To: Tristram.Ha@...rochip.com
Cc: andrew@...n.ch, Woojung.Huh@...rochip.com, robh@...nel.org,
	krzk+dt@...nel.org, conor+dt@...nel.org, davem@...emloft.net,
	edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
	marex@...x.de, UNGLinuxDriver@...rochip.com,
	devicetree@...r.kernel.org, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next 2/2] net: dsa: microchip: Add SGMII port support
 to KSZ9477 switch

On Wed, Nov 13, 2024 at 02:12:36AM +0000, Tristram.Ha@...rochip.com wrote:
> When the SFP says it supports 1000Base-T sfp_add_phy() is called by the
> SFP state machine and phylink_sfp_connect_phy() and
> phylink_sfp_config_phy() are run.  It is in the last function that the
> validation fails as the just created phy device does not initialize its
> supported and advertising fields yet.  The phy device has the
> opportunity later to fill them up if the phylink creation goes through,
> but that never happens.
> 
> A fix is to fill those fields with sfp_support like this:
> 
> @@ -3228,6 +3228,11 @@ static int phylink_sfp_config_phy(struct
>     struct phylink_link_state config;
>     int ret;
> 
> +    /* The newly created PHY device has empty settings. */
> +    if (linkmode_empty(phy->supported)) {
> +        linkmode_copy(phy->supported, pl->sfp_support);
> +        linkmode_copy(phy->advertising, pl->sfp_support);
> +    }
>     linkmode_copy(support, phy->supported);
> 
>     memset(&config, 0, sizeof(config));
> 
> The provided PCS driver from the DSA driver has an opportunity to change
> support with its validation check, but that does not look right as
> generally those checks remove certain bits from the link mode, but this
> requires completely copying new ones.  And this still does not work as
> the advertising field passed to the PCS driver has a const modifier.

I think I know what's happening, it's unfortunate it pushed you towards
wrong conclusions.

The "fix" you posted is wrong, and no, the PCS driver should not expand
the supported mask, just restrict it as you said. The phydev->supported
mask normally comes from the phy_probe() logic:

	/* Start out supporting everything. Eventually,
	 * a controller will attach, and may modify one
	 * or both of these values
	 */
	if (phydrv->features) {
		linkmode_copy(phydev->supported, phydrv->features);
		genphy_c45_read_eee_abilities(phydev);
	}
	else if (phydrv->get_features)
		err = phydrv->get_features(phydev);
	else if (phydev->is_c45)
		err = genphy_c45_pma_read_abilities(phydev);
	else
		err = genphy_read_abilities(phydev);

The SFP bus code depends strictly on sfp_sm_probe_phy() -> phy_device_register()
actually loading a precise device driver for the PHY synchronously via
phy_bus_match(). There is another lazy loading mechanism later in
phy_attach_direct(), for the Generic PHY driver:

	/* Assume that if there is no driver, that it doesn't
	 * exist, and we should use the genphy driver.
	 */

but that is too late for this code path, because as you say,
phylink_sfp_config_phy() is coded up to only call phylink_attach_phy()
if phylink_validate() succeeds. But phylink_validate() will only see a
valid phydev->supported mask with the Generic PHY driver if we let that
driver attach in phylink_attach_phy() in the first place.

Personally, I think SFP modules with embedded PHYs strictly require the
matching driver to be available to the kernel, due to that odd way in
which the Generic PHY driver is loaded, but I will let the PHY library
experts share their opinion as well.

You would be better off improving the error message, see what PHY ID you
get, then find and load the driver for it:

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 7dbcbf0a4ee2..8be473a7d262 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -1817,9 +1817,12 @@ static int sfp_sm_probe_phy(struct sfp *sfp, int addr, bool is_c45)
 
 	err = sfp_add_phy(sfp->sfp_bus, phy);
 	if (err) {
+		dev_err(sfp->dev,
+			"sfp_add_phy() for PHY %s (ID 0x%.8lx) failed: %pe, maybe PHY driver not loaded?\n",
+			phydev_name(phy), (unsigned long)phy->phy_id,
+			ERR_PTR(err));
 		phy_device_remove(phy);
 		phy_device_free(phy);
-		dev_err(sfp->dev, "sfp_add_phy failed: %pe\n", ERR_PTR(err));
 		return err;
 	}
 

Chances are it's one of CONFIG_MARVELL_PHY or CONFIG_AQUANTIA_PHY.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ