Message-ID: <aNFqYPLP2igudMq2@shell.armlinux.org.uk>
Date: Mon, 22 Sep 2025 16:25:20 +0100
From: "Russell King (Oracle)" <linux@...linux.org.uk>
To: Janpieter Sollie <janpieter.sollie@...elmail.de>
Cc: Andrew Lunn <andrew@...n.ch>, netdev@...r.kernel.org,
	Heiner Kallweit <hkallweit1@...il.com>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>
Subject: Re: [RFC] increase MDIO i2c poll timeout gradually (including patch)

On Mon, Sep 22, 2025 at 04:30:56PM +0200, Janpieter Sollie wrote:
> Based on my mails, I can certainly see why you're thinking this way.
> I have no idea what goes wrong anywhere between me making a modification in
> the mdio.c file -> i2c code -> ... -> SFP phy.
> I'm curious what goes wrong, notice the 3 dots in between,
> I know there's a pca9545 muxer in there further complicating it, but that's about it.
> 
> Long story short: should I somehow try to test the reliability of something else?

What you have in these setups is:

1. The I2C bus from the host to the SFP module pins. On the SFP module
   is an EEPROM at address 0x50 which contains some useful, some not so
   useful identification of the module.

2. Sometimes there is a PHY at 0x56, normally a Marvell 88E1111,
   which was designed for use on SFPs and supports I2C in addition
   to the conventional MDIO bus connectivity.

3. On some baseT modules, the PHY is not accessible.

4. Others have a microcontroller on them to implement the "Rollball"
   protocol - so far, some have been identified with an Arm Cortex-M
   controller and others with an 8051-based controller.

So, in the case of Rollball protocol modules, one is at the mercy of
the microcontroller receiving the I2C transactions, then accessing the
PHY over MDIO, and then responding appropriately. Given that there are
two different microcontrollers used for this task, I wouldn't be
surprised if there were numerous different firmwares running on them
of varying quality and efficiency.

I would suggest your module is taking excessively long to respond for
_some_ accesses. Maybe the controller isn't merely converting the
Rollball protocol to MDIO, but is doing other PHY manipulation as well,
e.g. emulating some functionality.

It may be interesting to work out whether it is a specific register or
set of registers that need longer access, and augment our knowledge
about what is going on with this stuff.
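
As a rough illustration of how that could be worked out, here is a
minimal userspace sketch in C that keeps the worst-case poll count seen
per register. It assumes each access is identified by a device address
and a register number, and that the caller knows how many poll attempts
the access took. All names here (reg_stat, record_access, slowest) are
hypothetical and not part of any kernel API.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED 64	/* arbitrary table size for this sketch */

/* Hypothetical per-register bookkeeping; not a kernel structure. */
struct reg_stat {
	uint8_t devad;		/* device address of the access */
	uint16_t reg;		/* register number */
	unsigned int max_polls;	/* most poll attempts seen so far */
	unsigned int accesses;	/* number of accesses recorded */
};

static struct reg_stat stats[MAX_TRACKED];
static int nstats;

/* Record that one access to (devad, reg) needed 'polls' attempts. */
static void record_access(uint8_t devad, uint16_t reg, unsigned int polls)
{
	for (int i = 0; i < nstats; i++) {
		if (stats[i].devad == devad && stats[i].reg == reg) {
			stats[i].accesses++;
			if (polls > stats[i].max_polls)
				stats[i].max_polls = polls;
			return;
		}
	}
	/* New register: add it if there is room, otherwise drop it. */
	if (nstats < MAX_TRACKED)
		stats[nstats++] = (struct reg_stat){ devad, reg, polls, 1 };
}

/* Return the entry with the highest poll count, or NULL if empty. */
static const struct reg_stat *slowest(void)
{
	const struct reg_stat *worst = NULL;

	for (int i = 0; i < nstats; i++)
		if (!worst || stats[i].max_polls > worst->max_polls)
			worst = &stats[i];
	return worst;
}
```

Calling record_access() after every completed access and dumping the
table afterwards would show whether the slow responses cluster on a few
registers or are spread across all of them.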

Ultimately yes, we likely have no option but to increase the timeout,
and to do that I suggest simply increasing the number of loops -
having the approx. 20ms delay between each attempt doesn't stress
anything.
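
To make the suggestion concrete, here is a minimal userspace sketch in
C of a poll loop where the per-attempt delay stays fixed at roughly
20ms and only the loop count is raised. This is not the driver code:
poll_until_done, the callback types, and the fake module are
hypothetical names used only for illustration.

```c
#include <stdbool.h>

#define POLL_DELAY_MS 20	/* approx. delay between attempts */

/* Hypothetical callback types; not kernel API. */
typedef bool (*done_fn)(void *ctx);
typedef void (*sleep_fn)(unsigned int ms);

/*
 * Sleep, then check for completion, up to max_loops times.
 * Returns the number of attempts used on success, or -1 on timeout.
 */
static int poll_until_done(done_fn done, void *ctx, sleep_fn sleep_ms,
			   int max_loops)
{
	for (int attempt = 1; attempt <= max_loops; attempt++) {
		sleep_ms(POLL_DELAY_MS);
		if (done(ctx))
			return attempt;
	}
	return -1;
}

/* Simulated module: reports done after a fixed number of status reads. */
struct fake_module {
	int reads_until_done;
};

static bool fake_done(void *ctx)
{
	struct fake_module *m = ctx;

	return --m->reads_until_done <= 0;
}

/* Simulated sleep: only accumulates the time that would have passed. */
static unsigned int slept_ms;

static void fake_sleep(unsigned int ms)
{
	slept_ms += ms;
}
```

With a simulated module that needs 15 status reads, a limit of 10 loops
times out after 200ms, while a limit of 30 loops succeeds on attempt 15
after 300ms. A module that responds quickly pays nothing extra for the
higher limit, since the loop exits on the first successful check.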

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
