Message-ID: <aQjAeCNGA2cjNXy6@oss.qualcomm.com>
Date: Mon, 3 Nov 2025 20:17:20 +0530
From: Mohd Ayaan Anwar <mohd.anwar@....qualcomm.com>
To: Vladimir Oltean <olteanv@...il.com>
Cc: "Russell King (Oracle)" <linux@...linux.org.uk>,
        Andrew Lunn <andrew@...n.ch>, Heiner Kallweit <hkallweit1@...il.com>,
        Alexandre Torgue <alexandre.torgue@...s.st.com>,
        Alexis Lothoré <alexis.lothore@...tlin.com>,
        Andrew Lunn <andrew+netdev@...n.ch>,
        Boon Khai Ng <boon.khai.ng@...era.com>,
        Daniel Machon <daniel.machon@...rochip.com>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>, Furong Xu <0x1207@...il.com>,
        Jacob Keller <jacob.e.keller@...el.com>,
        Jakub Kicinski <kuba@...nel.org>,
        "Jan Petrous (OSS)" <jan.petrous@....nxp.com>,
        linux-arm-kernel@...ts.infradead.org,
        linux-stm32@...md-mailman.stormreply.com,
        Maxime Chevallier <maxime.chevallier@...tlin.com>,
        Maxime Coquelin <mcoquelin.stm32@...il.com>, netdev@...r.kernel.org,
        Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
        Yu-Chun Lin <eleanor15x@...il.com>
Subject: Re: [PATCH net-next 0/3] net: stmmac: phylink PCS conversion part 3
 (dodgy stuff)

On Mon, Nov 03, 2025 at 02:13:53PM +0200, Vladimir Oltean wrote:
> On Mon, Nov 03, 2025 at 11:43:23AM +0000, Russell King (Oracle) wrote:
> > On Mon, Nov 03, 2025 at 04:50:03PM +0530, Mohd Ayaan Anwar wrote:
> > > On Mon, Nov 03, 2025 at 12:48:20PM +0200, Vladimir Oltean wrote:
> > > > 
> > > > As Russell partially pointed out, there are several assumptions in the
> > > > Aquantia PHY driver and in phylink, three of them being that:
> > > > - rate matching is only supported for PHY_INTERFACE_MODE_10GBASER and
> > > >   PHY_INTERFACE_MODE_2500BASEX (thus not PHY_INTERFACE_MODE_SGMII)
> > > > - if phy_get_rate_matching() returns RATE_MATCH_NONE for an interface,
> > > >   pl->phy_state.rate_matching will also be RATE_MATCH_NONE when using
> > > >   that interface
> > > > - if rate matching is used, the PHY is configured to use it for all
> > > >   media speeds <= phylink_interface_max_speed(link_state.interface)
> > > > 
> > > > Those assumptions are not validated very well against the ground truth
> > > > from the PHY provisioning, so the next step would be for us to see that
> > > > directly.
> > > > 
> > > > Please turn this print from aqr_gen2_read_global_syscfg() into something
> > > > visible in dmesg, i.e. by replacing phydev_dbg() with phydev_info():
> > > > 
> > > > 		phydev_dbg(phydev,
> > > > 			   "Media speed %d uses host interface %s with %s\n",
> > > > 			   syscfg->speed, phy_modes(syscfg->interface),
> > > > 			   syscfg->rate_adapt == AQR_RATE_ADAPT_NONE ? "no rate adaptation" :
> > > > 			   syscfg->rate_adapt == AQR_RATE_ADAPT_PAUSE ? "rate adaptation through flow control" :
> > > > 			   syscfg->rate_adapt == AQR_RATE_ADAPT_USX ? "rate adaptation through symbol replication" :
> > > > 			   "unrecognized rate adaptation type");
> > > 
> > > Thanks. Looks like rate adaptation is only provisioned for 10M, which
> > > matches my observation where phylink passes the exact speeds for
> > > 100/1000/2500 but 1000 for 10M.
> > 
> > Hmm, I wonder what the PHY is doing for that then. stmmac will be
> > programmed to read the Cisco SGMII in-band control word, and use
> > that to determine whether symbol replication for slower speeds is
> > being used.
> > 
> > If AQR115C is indicating 10M in the in-band control word, but is
> > actually operating the link at 1G speed, things are not going to
> > work, and I would say the PHY is broken to be doing that. The point
> > of the SGMII in-band control word is to tell the MAC about the
> > required symbol replication on the link for transmitting the slower
> > data rates over the link.
> > 
> > stmmac unfortunately doesn't give access to the raw Cisco SGMII
> > in-band control word. However, reading register 0xf8 bits 31:16 for
> > dwmac4, or register 0xd8 bits 15:0 for dwmac1000 will give this
> > information. In that bitfield, bits 2:1 give the speed. 2 = 1G,
> > 1 = 100M, 0 = 10M.
> 
> It might be Linux who is forcing the AQR115C into the nonsensical
> behaviour of advertising 10M in the SGMII control word while
> simultaneously forcing the PHY MII to operate at 1G with flow control
> for the 10M media speed.
> 
> We don't control the latter, but we do control the former:
> aqr_gen2_config_inband(), if given modes == LINK_INBAND_ENABLE, will
> enable in-band for all media speeds that use PHY_INTERFACE_MODE_SGMII.
> Regardless of how the PHY was provisioned for each media speed, and
> especially regardless of rate matching settings, this function will
> uniformly set the same in-band enabled/disabled setting for all media
> speeds using the same host interface.
> 
> If dwmac_integrated_pcs_inband_caps(), as per Russell's patch 1/3,
> reports LINK_INBAND_ENABLE | LINK_INBAND_DISABLE, and if
> aqr_gen2_inband_caps() also reports LINK_INBAND_ENABLE | LINK_INBAND_DISABLE,
> then we're giving phylink_pcs_neg_mode() all the tools it needs to shoot
> itself in the foot, and select LINK_INBAND_ENABLE.
> 
> The judgement call in the Aquantia PHY driver was mine, as documented in
> commit 5d59109d47c0 ("net: phy: aquantia: report and configure in-band
> autoneg capabilities"). The idea being that the configuration would have
> been unsupportable anyway given the question that the framework asks:
> "does the PHY use in-band for SGMII, or does it not?"
> 
> Assuming the configuration at 10Mbps wasn't always broken, there's only
> one way to know how it was supposed to work: more dumping of the initial
> provisioning, prior to our modification in aqr_gen2_config_inband().
> Ayaan, please re-print the same info with this new untested patch applied.
> I am going to assume that in-band autoneg isn't enabled in the unmodified
> provisioning, at least for 10M.
> 
> Russell's request for the integrated PCS status is also a good parallel
> confirmation that yes, we've entered a mode where the PHY advertises
> SGMII replication at 10M.

> From b91162e5dae8e20b477999c4f2fcdb98c219d663 Mon Sep 17 00:00:00 2001
> From: Vladimir Oltean <vladimir.oltean@....com>
> Date: Mon, 3 Nov 2025 14:03:55 +0200
> Subject: [PATCH] net: phy: aquantia: add inband setting to the
>  aqr_gen2_read_global_syscfg() print
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
> ---
>  drivers/net/phy/aquantia/aquantia_main.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
> index 41f3676c7f1e..f06b7b51bb7d 100644
> --- a/drivers/net/phy/aquantia/aquantia_main.c
> +++ b/drivers/net/phy/aquantia/aquantia_main.c
> @@ -839,6 +839,7 @@ static int aqr_gen2_read_global_syscfg(struct phy_device *phydev)
>  
>  	for (i = 0; i < AQR_NUM_GLOBAL_CFG; i++) {
>  		struct aqr_global_syscfg *syscfg = &priv->global_cfg[i];
> +		bool inband;
>  
>  		syscfg->speed = aqr_global_cfg_regs[i].speed;
>  
> @@ -849,6 +850,7 @@ static int aqr_gen2_read_global_syscfg(struct phy_device *phydev)
>  
>  		serdes_mode = FIELD_GET(VEND1_GLOBAL_CFG_SERDES_MODE, val);
>  		rate_adapt = FIELD_GET(VEND1_GLOBAL_CFG_RATE_ADAPT, val);
> +		inband = FIELD_GET(VEND1_GLOBAL_CFG_AUTONEG_ENA, val);
>  
>  		switch (serdes_mode) {
>  		case VEND1_GLOBAL_CFG_SERDES_MODE_XFI:
> @@ -896,12 +898,13 @@ static int aqr_gen2_read_global_syscfg(struct phy_device *phydev)
>  		}
>  
>  		phydev_dbg(phydev,
> -			   "Media speed %d uses host interface %s with %s\n",
> +			   "Media speed %d uses host interface %s with %s, inband %s\n",
>  			   syscfg->speed, phy_modes(syscfg->interface),
>  			   syscfg->rate_adapt == AQR_RATE_ADAPT_NONE ? "no rate adaptation" :
>  			   syscfg->rate_adapt == AQR_RATE_ADAPT_PAUSE ? "rate adaptation through flow control" :
>  			   syscfg->rate_adapt == AQR_RATE_ADAPT_USX ? "rate adaptation through symbol replication" :
> -			   "unrecognized rate adaptation type");
> +			   "unrecognized rate adaptation type",
> +			   str_enabled_disabled(inband));
>  	}
>  
>  	return 0;
> -- 
> 2.43.0
> 
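
For reference, the status field layout Russell describes above could be
decoded along these lines. This is only a user-space sketch, not stmmac
code: the enum and function names are made up for illustration, and the
only facts it relies on are the ones quoted above (register 0xf8 bits
31:16 for dwmac4, register 0xd8 bits 15:0 for dwmac1000, speed in bits
2:1 of that field).

```c
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the stmmac headers. */
enum dwmac_core { CORE_DWMAC1000, CORE_DWMAC4 };

/*
 * Extract the 16-bit status field from a raw 32-bit register value:
 * bits 31:16 of register 0xf8 on dwmac4, bits 15:0 of register 0xd8
 * on dwmac1000 (as described in the quoted mail).
 */
uint16_t sgmii_status_field(enum dwmac_core core, uint32_t reg)
{
	if (core == CORE_DWMAC4)
		return (uint16_t)(reg >> 16);

	return (uint16_t)(reg & 0xffff);
}

/*
 * Bits 2:1 of the field give the speed: 2 = 1G, 1 = 100M, 0 = 10M.
 * Returns the speed in Mbps, or -1 for the remaining encoding (3),
 * which the quoted description does not define.
 */
int sgmii_status_speed(uint16_t field)
{
	switch ((field >> 1) & 0x3) {
	case 0:
		return 10;
	case 1:
		return 100;
	case 2:
		return 1000;
	default:
		return -1;
	}
}
```

If the PHY really advertises 10M replication, the decoded speed from a
raw register dump should come out as 10 while the link is up at 10M.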

Here are the logs when I boot up with a 10M link:

[   10.743044] Aquantia AQR115C stmmac-0:08: Media speed 10 uses host interface sgmii with rate adaptation through flow control, inband enabled
[   10.757965] Aquantia AQR115C stmmac-0:08: Media speed 100 uses host interface sgmii with no rate adaptation, inband enabled
[   10.769857] Aquantia AQR115C stmmac-0:08: Media speed 1000 uses host interface sgmii with no rate adaptation, inband enabled
[   10.781840] Aquantia AQR115C stmmac-0:08: Media speed 2500 uses host interface 2500base-x with no rate adaptation, inband disabled
[   10.794346] Aquantia AQR115C stmmac-0:08: Media speed 5000 uses host interface 10gbase-r with rate adaptation through flow control, inband disabled
[   10.808358] Aquantia AQR115C stmmac-0:08: Media speed 10000 uses host interface 10gbase-r with no rate adaptation, inband disabled
[   10.827242] qcom-ethqos 23040000.ethernet eth1: PHY stmmac-0:08 uses interfaces 4,23,27, validating 23
[   10.836812] qcom-ethqos 23040000.ethernet eth1:  interface 23 (2500base-x) rate match pause supports 0-7,9,13-14,47
[   10.836817] qcom-ethqos 23040000.ethernet eth1: PHY [stmmac-0:08] driver [Aquantia AQR115C] (irq=318)
[   10.836819] qcom-ethqos 23040000.ethernet eth1: phy: 2500base-x setting supported 0000000,00000000,00008000,000062ff advertising 0000000,00000000,00008000,000062ff
[   10.851865] qcom-ethqos 23040000.ethernet eth1: Enabling Safety Features
[   10.882611] qcom-ethqos 23040000.ethernet eth1: IEEE 1588-2008 Advanced Timestamp supported
[   10.895207] qcom-ethqos 23040000.ethernet eth1: registered PTP clock
[   10.902334] qcom-ethqos 23040000.ethernet eth1: configuring for phy/2500base-x link mode
[   10.910654] qcom-ethqos 23040000.ethernet eth1: major config, requested phy/2500base-x
[   10.918790] qcom-ethqos 23040000.ethernet eth1: has pcs = true
[   10.924787] qcom-ethqos 23040000.ethernet eth1: interface 2500base-x inband modes: pcs=01 phy=00
[   10.933809] qcom-ethqos 23040000.ethernet eth1: major config, active phy/outband/2500base-x
[   10.942388] qcom-ethqos 23040000.ethernet eth1: phylink_mac_config: mode=phy/2500base-x/none adv=0000000,00000000,00000000,00000000 pause=00
[   10.966344] qcom-ethqos 23040000.ethernet eth1: phy link down 2500base-x/Unknown/Unknown/none/off/nolpi
[   12.819779] qcom-ethqos 23040000.ethernet eth1: phy link up sgmii/10Mbps/Half/pause/off/nolpi
[   12.825947] stmmac_pcs: Link Down
[   12.829539] qcom-ethqos 23040000.ethernet eth1: major config, requested phy/sgmii
[   12.831998] stmmac_pcs: Link Down
[   12.839683] qcom-ethqos 23040000.ethernet eth1: has pcs = true
[   12.843123] stmmac_pcs: Link Down
[   12.849121] qcom-ethqos 23040000.ethernet eth1: interface sgmii inband modes: pcs=03 phy=03
[   12.852546] stmmac_pcs: Link Down
[   12.861109] qcom-ethqos 23040000.ethernet eth1: major config, active phy/outband/sgmii
[   12.864535] stmmac_pcs: Link Down
[   12.872724] qcom-ethqos 23040000.ethernet eth1: phylink_mac_config: mode=phy/sgmii/pause adv=0000000,00000000,00000000,00000000 pause=00
[   12.876094] stmmac_pcs: Link Down
[   12.891394] qcom-ethqos 23040000.ethernet eth1: ethqos_configure_sgmii : Speed = 1000
[   12.892094] stmmac_pcs: Link Down
[   12.900109] dwmac: PCS configuration changed from phylink by glue, please report: 0x00040000 -> 0x00041200
[   12.903555] stmmac_pcs: Link Up
[   12.913473] qcom-ethqos 23040000.ethernet eth1: Link is Up - 10Mbps/Half - flow control off
[   12.916679] stmmac_pcs: Link Down
[   12.928659] stmmac_pcs: ANE process completed
[   12.933133] stmmac_pcs: Link Up
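
To make the comparison against the first assumption quoted earlier
(rate matching only on 10GBASE-R and 2500BASE-X) explicit, the six
"Media speed" lines above can be put in a small table and checked
mechanically. This is a stand-alone sketch with hypothetical type and
function names; the table values are transcribed from the log above and
nothing here is driver code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's AQR_RATE_ADAPT_* values. */
enum rate_adapt { RA_NONE, RA_PAUSE, RA_USX };

struct provision {
	int speed;              /* media speed in Mbps */
	const char *interface;  /* host interface name as printed in dmesg */
	enum rate_adapt rate_adapt;
	bool inband;            /* "inband enabled"/"inband disabled" in the print */
};

/* Transcribed from the "Media speed ..." lines in the boot log above. */
const struct provision aqr115c_provisioning[] = {
	{    10, "sgmii",      RA_PAUSE, true  },
	{   100, "sgmii",      RA_NONE,  true  },
	{  1000, "sgmii",      RA_NONE,  true  },
	{  2500, "2500base-x", RA_NONE,  false },
	{  5000, "10gbase-r",  RA_PAUSE, false },
	{ 10000, "10gbase-r",  RA_NONE,  false },
};

#define NUM_PROVISIONS \
	(sizeof(aqr115c_provisioning) / sizeof(aqr115c_provisioning[0]))

/* True if rate adaptation is provisioned on anything other than
 * 10gbase-r or 2500base-x, i.e. the entry contradicts the assumption. */
bool violates_rate_match_assumption(const struct provision *p)
{
	if (p->rate_adapt == RA_NONE)
		return false;

	return strcmp(p->interface, "10gbase-r") != 0 &&
	       strcmp(p->interface, "2500base-x") != 0;
}

/* Count offending entries; store the first offending speed if requested. */
size_t count_violations(const struct provision *tbl, size_t n, int *first_speed)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (!violates_rate_match_assumption(&tbl[i]))
			continue;
		if (count == 0 && first_speed)
			*first_speed = tbl[i].speed;
		count++;
	}

	return count;
}
```

With the table above, the only entry that trips the check is the 10M
one, which is the speed where phylink was passed 1000 instead of the
media speed.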

Unrelated to the above, but I found it odd that the link comes up in
half duplex at 10M. I have to enable full duplex manually via ethtool.
I will try a different link partner tomorrow to rule out any issues on
the other end.

	Ayaan
