Message-ID: <aOYXEFf1fVK93QeS@FUE-ALEWI-WINX>
Date: Wed, 8 Oct 2025 09:47:28 +0200
From: Alexander Wilhelm <alexander.wilhelm@...termo.com>
To: Vladimir Oltean <vladimir.oltean@....com>
Cc: "Russell King (Oracle)" <linux@...linux.org.uk>,
        Andrew Lunn <andrew@...n.ch>, Heiner Kallweit <hkallweit1@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: Aquantia PHY in OCSGMII mode?

On Tue, Oct 07, 2025 at 05:08:19PM +0300, Vladimir Oltean wrote:
> Hi Alexander,
[...]
> Sorry for the delay. What you have found are undoubtedly two major bugs,
> causing the Lynx PCS to operate in undefined behaviour territory.
> Nonetheless, while your finding has helped me discover many previously
> unknown facts about the hardware IP, I still cannot replicate exactly
> your reported behaviour. In order to fully process things, I would like
> to ask a few more clarification questions.

Sure.

> Is your U-Boot implementation based on NXP's dtsec_configure_serdes()?
> https://github.com/u-boot/u-boot/blob/master/drivers/net/fm/eth.c#L57

Unfortunately, I am working with an older U-Boot version, v2016.07. However,
the bug I fixed was not part of the official U-Boot codebase; it was
introduced by our team:

    value = PHY_SGMII_IF_MODE_SGMII;
    value |= PHY_SGMII_IF_MODE_AN;

I added the missing `if` condition as follows:

    if (!sgmii_2500) {
        value = PHY_SGMII_IF_MODE_SGMII;
        value |= PHY_SGMII_IF_MODE_AN;
    }

With the official U-Boot codebase, I don't get a ping at any of the
speeds:

    value = PHY_SGMII_IF_MODE_SGMII;
    if (!sgmii_2500)
        value |= PHY_SGMII_IF_MODE_AN;
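
To make the difference between the three snippets above concrete, here is a
minimal sketch that models each one as a function returning the resulting
IF_MODE value. The function names and the `prev` parameter are hypothetical,
and the macro values 0x0001 and 0x0002 are an assumption: this mail does not
state them, they are only chosen to be consistent with an IF_MODE readback
of 0x3 when both flags are set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit values, not taken from this mail. */
#define PHY_SGMII_IF_MODE_SGMII 0x0001
#define PHY_SGMII_IF_MODE_AN    0x0002

/* Variant 1: the team's original code, flags set unconditionally. */
static uint16_t if_mode_unconditional(bool sgmii_2500)
{
	uint16_t value;

	(void)sgmii_2500;	/* never consulted */
	value = PHY_SGMII_IF_MODE_SGMII;
	value |= PHY_SGMII_IF_MODE_AN;
	return value;
}

/*
 * Variant 2: the fix with the added `if`. `prev` is a hypothetical stand-in
 * for whatever `value` held before the snippet, which the mail does not show.
 */
static uint16_t if_mode_fixed(uint16_t prev, bool sgmii_2500)
{
	uint16_t value = prev;

	if (!sgmii_2500) {
		value = PHY_SGMII_IF_MODE_SGMII;
		value |= PHY_SGMII_IF_MODE_AN;
	}
	return value;
}

/* Variant 3: the snippet quoted as the official codebase. */
static uint16_t if_mode_official(bool sgmii_2500)
{
	uint16_t value = PHY_SGMII_IF_MODE_SGMII;

	if (!sgmii_2500)
		value |= PHY_SGMII_IF_MODE_AN;
	return value;
}
```

Under these assumptions, with `sgmii_2500` set, variant 1 yields 0x3,
variant 2 leaves `prev` untouched, and variant 3 yields 0x1 (SGMII mode
still enabled, only the AN flag dropped).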

> Why would U-Boot set IF_MODE_SGMII_EN | IF_MODE_USE_SGMII_AN only when
> the AQR115 resolves only to 100M, but not in the other cases (which do
> not have this problem)? Or does it do it irrespective of resolved media
> side link speed? Simply put: what did the code that you fixed up look like?

In our implementation, the SGMII flags were always set in U-Boot,
regardless of the negotiated link speed. My assumption is that, with SGMII
mode configured, the 10x symbol replication is applied only for a 100M
link, not for 1G. For a 2.5G link, the behavior ends up being the same as
for 1G, since there is no actual SGMII mode for 2.5G.
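
As a rough illustration of the replication mentioned here: an SGMII lane
always runs at the 1G line rate, so slower speeds repeat each symbol to fill
the lane. The helper below is a hypothetical sketch of that rule only; it
does not describe what the Lynx PCS or the AQR115 actually do in OCSGMII
mode.

```c
/*
 * Return how many times each symbol is repeated on an SGMII lane for the
 * given media speed, or 0 if SGMII defines no such speed (e.g. 2500).
 */
static int sgmii_replication(int speed_mbps)
{
	switch (speed_mbps) {
	case 1000:
		return 1;	/* native rate, no repetition */
	case 100:
		return 10;	/* each symbol sent 10 times */
	case 10:
		return 100;	/* each symbol sent 100 times */
	default:
		return 0;	/* not an SGMII speed */
	}
}
```

If one side replicates 10x for 100M and the other side does not expect it,
the two ends disagree about the data on the lane, which would fit a link
that only fails at 100M.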

> With the U-Boot fix reverted, could you please replicate the broken
> setup with AQR115 linking at 100Mbps, and add the following function in
> Linux drivers/net/pcs/pcs-lynx.c?
> 
> static void lynx_pcs_debug(struct mdio_device *pcs)
> {
> 	int bmsr = mdiodev_read(pcs, MII_BMSR);
> 	int bmcr = mdiodev_read(pcs, MII_BMCR);
> 	int adv = mdiodev_read(pcs, MII_ADVERTISE);
> 	int lpa = mdiodev_read(pcs, MII_LPA);
> 	int if_mode = mdiodev_read(pcs, IF_MODE);
> 
> 	dev_info(&pcs->dev, "BMSR 0x%x, BMCR 0x%x, ADV 0x%x, LPA 0x%x, IF_MODE 0x%x\n", bmsr, bmcr, adv, lpa, if_mode);
> }
> 
> and call it from:
> 
> static void lynx_pcs_get_state(struct phylink_pcs *pcs, unsigned int neg_mode,
> 			       struct phylink_link_state *state)
> {
> 	struct lynx_pcs *lynx = phylink_pcs_to_lynx(pcs);
> 
> 	lynx_pcs_debug(lynx->mdio); // <- here
> 
> 	switch (state->interface) {
> 	...
> 
> With this, I would like to know:
> (a) what is the IF_MODE register content outside of the IF_MODE_SGMII_EN
>     and IF_MODE_USE_SGMII_AN bits.
> (b) what is the SGMII code word advertised by the AQR115 in OCSGMII mode.
> 
> Then if you could replicate this test for 1Gbps medium link speed, it
> would be great.

For now, I have reverted both the U-Boot and kernel fixes and added debug
outputs for further analysis. Unfortunately, the function
`lynx_pcs_get_state` is never called in my kernel code, so I put the debug
function into `lynx_pcs_config` instead. Here is the output:

    mdio_bus 0x0000000ffe4e5000:00: BMSR 0x29, BMCR 0x1140, ADV 0x4001, LPA 0xdc01, IF_MODE 0x3
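
For reference, the dump above can be split into fields with a small decoder.
This is a sketch under assumptions: the BMSR, BMCR and LPA masks follow the
clause 22 and SGMII definitions in Linux include/uapi/linux/mii.h, the
IF_MODE bits are taken as bit 0 (SGMII enable) and bit 1 (use SGMII AN), and
the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define BMSR_LSTATUS		0x0004	/* link status */
#define BMSR_ANEGCOMPLETE	0x0020	/* auto-negotiation complete */
#define BMCR_ANENABLE		0x1000	/* auto-negotiation enabled */
#define LPA_SGMII_SPD_MASK	0x0c00	/* SGMII speed field, bits 11:10 */
#define LPA_SGMII_FULL_DUPLEX	0x1000
#define LPA_SGMII_LINK		0x8000
#define IF_MODE_SGMII_EN	0x0001	/* assumed: bit 0 */
#define IF_MODE_USE_SGMII_AN	0x0002	/* assumed: bit 1 */

struct pcs_decoded {
	bool link_up;		/* BMSR link status */
	bool an_complete;	/* BMSR auto-negotiation complete */
	bool an_enabled;	/* BMCR auto-negotiation enable */
	bool lp_link;		/* link bit in the partner code word */
	bool lp_full_duplex;	/* duplex bit in the partner code word */
	unsigned int lp_speed;	/* raw 2-bit speed field, not interpreted */
	bool sgmii_en;
	bool use_sgmii_an;
	uint16_t if_mode_other;	/* IF_MODE bits outside the two flags */
};

static struct pcs_decoded pcs_decode(uint16_t bmsr, uint16_t bmcr,
				     uint16_t lpa, uint16_t if_mode)
{
	struct pcs_decoded d;

	d.link_up = bmsr & BMSR_LSTATUS;
	d.an_complete = bmsr & BMSR_ANEGCOMPLETE;
	d.an_enabled = bmcr & BMCR_ANENABLE;
	d.lp_link = lpa & LPA_SGMII_LINK;
	d.lp_full_duplex = lpa & LPA_SGMII_FULL_DUPLEX;
	d.lp_speed = (lpa & LPA_SGMII_SPD_MASK) >> 10;
	d.sgmii_en = if_mode & IF_MODE_SGMII_EN;
	d.use_sgmii_an = if_mode & IF_MODE_USE_SGMII_AN;
	d.if_mode_other = if_mode &
		(uint16_t)~(IF_MODE_SGMII_EN | IF_MODE_USE_SGMII_AN);
	return d;
}
```

Calling `pcs_decode(0x29, 0x1140, 0xdc01, 0x3)` with the values from the
dump gives, under these assumptions: no IF_MODE bits set outside the two
SGMII flags, auto-negotiation enabled and complete but BMSR link status
clear, and a partner code word with link and full duplex set and a raw speed
field of 3. The mii.h layout defines speed values 0 to 2 only, so the sketch
leaves that field uninterpreted.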

I hope it'll help to analyze the problem further.


Best regards
Alexander Wilhelm
