Message-ID: <aXjSbu6L6ICYOPiJ@oss.qualcomm.com>
Date: Tue, 27 Jan 2026 20:27:50 +0530
From: Mohd Ayaan Anwar <mohd.anwar@....qualcomm.com>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: Andrew Lunn <andrew@...n.ch>, Heiner Kallweit <hkallweit1@...il.com>,
Alexandre Torgue <alexandre.torgue@...s.st.com>,
Andrew Lunn <andrew+netdev@...n.ch>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Konrad Dybcio <konrad.dybcio@....qualcomm.com>,
linux-arm-kernel@...ts.infradead.org, linux-arm-msm@...r.kernel.org,
linux-phy@...ts.infradead.org,
linux-stm32@...md-mailman.stormreply.com,
Maxime Coquelin <mcoquelin.stm32@...il.com>,
Neil Armstrong <neil.armstrong@...aro.org>, netdev@...r.kernel.org,
Paolo Abeni <pabeni@...hat.com>, Vinod Koul <vkoul@...nel.org>
Subject: Re: [PATCH net-next v2 00/14] net: stmmac: SerDes, PCS, BASE-X, and
inband goodies
On Fri, Jan 23, 2026 at 09:32:21PM +0000, Russell King (Oracle) wrote:
>
> and the failing store is the one for that last line of C code - in
> other words, pcs = NULL.
>
> This means that mac_select_pcs() returned NULL when being asked
> "which PCS should be used for 2500base-X" ?
>
> This suggests that the SerDes detection of support for 2500BASE-X
> isn't working, meaning that stmmac_mac_select_pcs() ends up returning
> NULL, rather than &priv->integrated_pcs->pcs.
>
> That would only happen if:
>
> 	/* Only allow 2500Base-X if the SerDes has support. */
> 	ret = dwmac_serdes_validate(priv, PHY_INTERFACE_MODE_2500BASEX);
> 	if (ret == 0)
> 		__set_bit(PHY_INTERFACE_MODE_2500BASEX,
> 			  spcs->pcs.supported_interfaces);
>
> fails, meaning we don't set that interface mode for the PCS.
> dwmac_serdes_validate() calls phy_validate() for PHY_MODE_ETHERNET
> with the PHY interface mode as the sub mode.
>
> Patch 3 adds the required methods to phy-qcom-sgmii-eth.c to allow
> phy_validate() to indicate whether this is supported or not:
>
> .validate = qcom_dwmac_sgmii_phy_validate,
>
> and its implementation is:
>
> int ret = qcom_dwmac_sgmii_phy_speed(mode, submode);
>
> return ret < 0 ? ret : 0;
>
> where qcom_dwmac_sgmii_phy_speed() is:
>
> 	if (mode != PHY_MODE_ETHERNET)
> 		return -EINVAL;
>
> 	if (submode == PHY_INTERFACE_MODE_SGMII ||
> 	    submode == PHY_INTERFACE_MODE_1000BASEX)
> 		return SPEED_1000;
>
> 	if (submode == PHY_INTERFACE_MODE_2500BASEX)
> 		return SPEED_2500;
>
> 	return -EINVAL;
>
> So, this should be returning a positive integer (SPEED_2500), which
> should cause phy_validate(serdes, PHY_MODE_ETHERNET,
> PHY_INTERFACE_MODE_2500BASEX, NULL) to return success (zero). That
> should result in PHY_INTERFACE_MODE_2500BASEX being set in
> spcs->pcs.supported_interfaces, and thus &priv->integrated_pcs->pcs
> being returned for PHY_INTERFACE_MODE_2500BASEX.
>
> Is the particular hardware you're running this oopsing test on not
> using a SerDes PHY? If that's the case, how does it switch between
> 2.5Gbps and 1Gbps data rate on the SerDes?
>
It is using the same SerDes PHY (qcom_dwmac_sgmii_phy_driver).

I added additional debug prints, and I think the crash is due to
BMSR_ESTATEN not being set in GMAC_AN_STATUS.

During pcs_init, BIT(8) of GMAC_AN_STATUS is 0:

	[ 7.985913] [DBG] GMAC_AN_STATUS = 8

Therefore, this check:

	if (readl(spcs->base + GMAC_AN_STATUS) & BMSR_ESTATEN) {
		__set_bit(PHY_INTERFACE_MODE_1000BASEX,
			  spcs->pcs.supported_interfaces);
		/* Only allow 2500Base-X if the SerDes has support. */
		ret = dwmac_serdes_validate(priv, PHY_INTERFACE_MODE_2500BASEX);
		if (ret == 0)
			__set_bit(PHY_INTERFACE_MODE_2500BASEX,
				  spcs->pcs.supported_interfaces);
	}

fails, and PHY_INTERFACE_MODE_2500BASEX never gets set in
pcs.supported_interfaces. Pardon my naivete, but does the
BMSR_ESTATEN bit not being set break some standard?
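For reference, the logged value can be decoded against the BMSR bit layout that the check above applies to GMAC_AN_STATUS. This is only a userspace sketch: the bit values are copied from include/uapi/linux/mii.h, the helper names are made up for illustration, and it assumes the debug print shows the raw register value (8 reads the same in decimal and hex).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/mii.h */
#define BMSR_LSTATUS		0x0004	/* Link status */
#define BMSR_ANEGCAPABLE	0x0008	/* Able to do auto-negotiation */
#define BMSR_ANEGCOMPLETE	0x0020	/* Auto-negotiation complete */
#define BMSR_ESTATEN		0x0100	/* Extended Status in R15 */

/* BMSR_ESTATEN is the BIT(8) referred to above. */
static_assert(BMSR_ESTATEN == (1u << 8), "BMSR_ESTATEN must be BIT(8)");

/* Hypothetical helper: true if the extended-status bit is set. */
static bool an_status_has_estaten(uint32_t an_status)
{
	return (an_status & BMSR_ESTATEN) != 0;
}

/* Hypothetical helper: true if the AN-capable bit is set. */
static bool an_status_an_capable(uint32_t an_status)
{
	return (an_status & BMSR_ANEGCAPABLE) != 0;
}
```

With an_status = 8, only BMSR_ANEGCAPABLE is set, so an_status_has_estaten() is false and the body of the check is skipped.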
If I remove the check, the NULL pointer dereference is no longer
observed, although the SerDes link is still unstable.
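To illustrate why removing the check avoids the NULL PCS, here is a simplified userspace model of the flow, not the driver code. The struct and function names are hypothetical, the interface numbers 4 (sgmii) and 23 (2500base-x) are taken from the logs below, the value 22 for 1000base-x is an assumption, and SGMII being set unconditionally is inferred from the logged supported_interfaces = 4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BMSR_ESTATEN	0x0100	/* from include/uapi/linux/mii.h */

/* Interface numbers: 4 and 23 as logged; 22 is an assumption. */
enum model_interface {
	MODEL_IF_SGMII = 4,
	MODEL_IF_1000BASEX = 22,
	MODEL_IF_2500BASEX = 23,
};

static_assert(MODEL_IF_2500BASEX < 64, "interface must fit the bitmap");

/* Hypothetical stand-in for the PCS and its supported_interfaces bitmap. */
struct model_pcs {
	uint64_t supported_interfaces;
};

/* Models pcs_init: BASE-X modes are only added behind the ESTATEN check. */
static void model_pcs_init(struct model_pcs *pcs, uint32_t an_status,
			   bool serdes_has_2500, bool check_estaten)
{
	/* Assumption: SGMII is always supported, as the logs suggest. */
	pcs->supported_interfaces = 1ULL << MODEL_IF_SGMII;

	if (!check_estaten || (an_status & BMSR_ESTATEN)) {
		pcs->supported_interfaces |= 1ULL << MODEL_IF_1000BASEX;
		/* Only allow 2500Base-X if the SerDes has support. */
		if (serdes_has_2500)
			pcs->supported_interfaces |= 1ULL << MODEL_IF_2500BASEX;
	}
}

/* Models mac_select_pcs: NULL when the interface bit is not set. */
static struct model_pcs *model_select_pcs(struct model_pcs *pcs,
					  int interface)
{
	if (pcs->supported_interfaces & (1ULL << interface))
		return pcs;
	return NULL;
}
```

In this model, an_status = 8 with the check enabled yields NULL for interface 23 even when the SerDes validates 2500base-x, which matches the "mac_select_pcs returned NULL" line in the logs. With the check disabled, the same inputs return the PCS.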
I also tried enabling comma detect during dwmac_integrated_pcs_config,
but I am still seeing the Tx timeouts. I remember that when I had
tested the patches in October (without the SerDes driver changes),
the link state used to flap, but the data path became functional
after the link stabilized.
Ayaan
---
Full Logs (Speed Change: 1G -> 2.5G)
[ 244.817499] qcom-ethqos 23040000.ethernet eth1: pcs link down
[ 257.066210] dwmac: PCS configuration changed from phylink by glue, please report: 0x00040000 -> 0x00041000
[ 257.076143] dwmac: ANE 0 -> 1
[ 257.079668] qcom-ethqos 23040000.ethernet eth1: Link is Up - 1Gbps/Full - flow control off
[ 264.260852] qcom-ethqos 23040000.ethernet eth1: NETDEV WATCHDOG: CPU: 7: transmit queue 3 timed out 5472 ms
[ 264.271394] qcom-ethqos 23040000.ethernet eth1: Reset adapter.
[ 264.280493] qcom-ethqos 23040000.ethernet eth1: phy link down 2500base-x/Unknown/Unknown/none/off/nolpi
[ 264.842309] qcom-ethqos 23040000.ethernet eth1: Timeout accessing MAC_VLAN_Tag_Filter
[ 264.850391] qcom-ethqos 23040000.ethernet eth1: failed to kill vid 0081/0
[ 264.857547] qcom-ethqos 23040000.ethernet eth1: Register MEM_TYPE_PAGE_POOL RxQ-0
[ 264.865795] qcom-ethqos 23040000.ethernet eth1: Register MEM_TYPE_PAGE_POOL RxQ-1
[ 264.873939] qcom-ethqos 23040000.ethernet eth1: Register MEM_TYPE_PAGE_POOL RxQ-2
[ 264.882111] qcom-ethqos 23040000.ethernet eth1: Register MEM_TYPE_PAGE_POOL RxQ-3
[ 265.792807] qcom-ethqos 23040000.ethernet eth1: PHY stmmac-0:08 uses interfaces 4,23,27, validating 23
[ 265.802389] [DBG] stmmac_mac_select_pcs - testing for 23 (2500base-x) on priv->integrated_pcs->pcs.supported_interfaces = 4
[ 265.802399] qcom-ethqos 23040000.ethernet eth1: interface 23 (2500base-x) rate match pause supports 0-7,9,13-14,47
[ 265.824572] qcom-ethqos 23040000.ethernet eth1: PHY [stmmac-0:08] driver [Aquantia AQR115C] (irq=334)
[ 265.834055] qcom-ethqos 23040000.ethernet eth1: phy: sgmii setting supported 00000000,00000000,00008000,000062ff advertising 00000000,00000000,00008000,000062ff
[ 265.852828] [DBG] qcom_dwmac_sgmii_phy_speed called with mode=15, submode=4
[ 265.852837] [DBG] qcom_dwmac_sgmii_phy_validate - qcom_dwmac_sgmii_phy_speed returned 1000
[ 265.868580] qcom-ethqos 23040000.ethernet eth1: Enabling Safety Features
[ 265.884237] qcom-ethqos 23040000.ethernet eth1: IEEE 1588-2008 Advanced Timestamp supported
[ 265.893946] qcom-ethqos 23040000.ethernet eth1: registered PTP clock
[ 265.900561] qcom-ethqos 23040000.ethernet eth1: configuring for phy/sgmii link mode
[ 265.908451] qcom-ethqos 23040000.ethernet eth1: major config, requested phy/sgmii
[ 265.916159] [DBG] stmmac_mac_select_pcs - testing for 4 (sgmii) on priv->integrated_pcs->pcs.supported_interfaces = 4
[ 265.916166] qcom-ethqos 23040000.ethernet eth1: interface sgmii inband modes: pcs=03 phy=03
[ 265.935652] qcom-ethqos 23040000.ethernet eth1: major config, active phy/outband/sgmii
[ 265.943795] qcom-ethqos 23040000.ethernet eth1: phylink_mac_config: mode=phy/sgmii/none adv=00000000,00000000,00000000,00000000 pause=00
[ 265.956407] [DBG] qcom_dwmac_sgmii_phy_speed called with mode=15, submode=4
[ 265.956408] [DBG] qcom_dwmac_sgmii_phy_set_mode - qcom_dwmac_sgmii_phy_speed returned 1000
[ 265.976997] qcom-ethqos 23040000.ethernet eth1: phy link down 2500base-x/Unknown/Unknown/none/off/nolpi
[ 270.556001] qcom-ethqos 23040000.ethernet eth1: phy link up 2500base-x/2.5Gbps/Full/none/off/nolpi
[ 270.567649] qcom-ethqos 23040000.ethernet eth1: major config, requested phy/2500base-x
[ 270.575823] [DBG] stmmac_mac_select_pcs - testing for 23 (2500base-x) on priv->integrated_pcs->pcs.supported_interfaces = 4
[ 270.575831] qcom-ethqos 23040000.ethernet eth1: mac_select_pcs returned NULL
[ 270.594521] qcom-ethqos 23040000.ethernet eth1: interface 2500base-x inband modes: pcs=00 phy=00
[ 270.603554] qcom-ethqos 23040000.ethernet eth1: major config, active phy/outband/2500base-x
[ 270.612286] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010