Message-ID: <1e9c7da5-4ace-476d-8a4b-b05bc44eedf1@kabelmail.de>
Date: Sat, 20 Sep 2025 14:34:38 +0200
From: Janpieter Sollie <janpieter.sollie@...elmail.de>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: Andrew Lunn <andrew@...n.ch>, netdev@...r.kernel.org,
 Heiner Kallweit <hkallweit1@...il.com>, "David S. Miller"
 <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>
Subject: Re: [RFC] increase MDIO i2c poll timeout gradually (including patch)

Op 20/09/2025 om 13:18 schreef Russell King (Oracle):
> On Sat, Sep 20, 2025 at 12:00:50PM +0200, Janpieter Sollie wrote:
>> Op 19/09/2025 om 19:04 schreef Andrew Lunn:
>>> On Fri, Sep 19, 2025 at 03:52:55PM +0200, Janpieter Sollie wrote:
>>>> Hello everyone,
>>> Please ensure you Cc: the correct Maintainers.
>>>
>>> ./scripts/get_maintainer.pl drivers/net/phy/sfp.c
>>> Russell King <linux@...linux.org.uk> (maintainer:SFF/SFP/SFP+ MODULE SUPPORT)
>>> Andrew Lunn <andrew@...n.ch> (maintainer:ETHERNET PHY LIBRARY)
>>> Heiner Kallweit <hkallweit1@...il.com> (maintainer:ETHERNET PHY LIBRARY)
>>> "David S. Miller" <davem@...emloft.net> (maintainer:NETWORKING DRIVERS)
>>> Eric Dumazet <edumazet@...gle.com> (maintainer:NETWORKING DRIVERS)
>>> Jakub Kicinski <kuba@...nel.org> (maintainer:NETWORKING DRIVERS)
>>> Paolo Abeni <pabeni@...hat.com> (maintainer:NETWORKING DRIVERS)
>>> netdev@...r.kernel.org (open list:SFF/SFP/SFP+ MODULE SUPPORT)
>>> linux-kernel@...r.kernel.org (open list)
>> Done, sorry, this is my first post here
>>>> I tested a SFP module where the i2c bus is "unstable" at best.
>>> Please tell us more about the hardware.
>>>
>>> Also, what speed do you have the I2C bus running at? Have you tried
>>> different clock-frequency values to slow down the I2C bus? Have you
>>> checked the pull-up resistors? I2C problems are sometimes due to too
>>> strong pull-ups.
>> The hardware is a bananapi R4 2xSFP using a MT7988a SoC.
>> The SFP+ module is a RJ45 rollball module using a AQR113C phy, but needs a
>> quirk in sfp.c (added below)
>> I'm not a i2c expert at all,
>> but about the i2c bus speed, the SFP cage seems to be behind a muxer, not a i2c root.
>> I could not find anything about i2c bus speed in /proc or /sys, maybe it's impossible to tell?
>>
>> The dtsi or dtso files do not mention anything about bus speeds, so I honestly do not know.
> As you have not included the author of the SFP support (me) in your
> initial email, and have not provided a repeat of the description,
> I'm afraid I have no idea what the issue is that you're encountering.
>
> Thanks.
>

Yes indeed, I see that now.

Here is the original post:

======================

Hello everyone,

I tested an SFP module whose i2c bus is "unstable" at best.
Various i2c timeouts occurred, resulting in a "phy not detected" error message.
A simple (though not very revealing) stack dump during probe:

 > mdio_i2c_alloc (drivers/net/mdio/mdio-i2c.c:268) mdio_i2c
 > mdio_i2c_alloc (drivers/net/mdio/mdio-i2c.c:316) mdio_i2c
 > __mdiobus_c45_read (drivers/net/phy/mdio_bus.c:992)
 > mdiobus_c45_read (drivers/net/phy/mdio_bus.c:1133)
 > get_phy_c45_ids (drivers/net/phy/phy_device.c:947)
 > get_phy_device (drivers/net/phy/phy_device.c:1054)
 > init_module (drivers/net/phy/sfp.c:1820) sfp
 > cleanup_module (drivers/net/phy/sfp.c:1956 drivers/net/phy/sfp.c:2667 drivers/net/phy/sfp.c:2748) sfp
 > cleanup_module (drivers/net/phy/sfp.c:2760 drivers/net/phy/sfp.c:2892) sfp
 > process_one_work (./arch/arm64/include/asm/jump_label.h:32 ./include/linux/jump_label.h:207)
 > worker_thread (kernel/workqueue.c:3304 (discriminator 2) kernel/workqueue.c:3391 (discriminator 2))
 > kthread (kernel/kthread.c:389)
 > ret_from_fork (arch/arm64/kernel/entry.S:863)

I noticed a few hard-coded numbers in i2c_rollball_mii_poll(), which is always suspicious.
To lower the stress on the i2c bus, I made the patch below, which gradually increases the poll delay.
Is this the best way to avoid stressing sensitive devices?
Will it cause a performance regression on some other SFP cages?

Eric Woudstra told me another option was to add a few tries by raising the initial i = 10.
But if the issue isn't the device itself, and the stress on the i2c bus is simply too high, that
may not be a real solution.

A good question may be: is this approach sufficient to bridge the gap between
"high performance" equipment, which has a stable i2c bus and should not be kept waiting,
and embedded equipment (the device I tested on was a BPI-R4), where every milliwatt counts?
Or should this be fixed at another point in the initialization process (e.g. by not needlessly
probing all phy ids)?

Thanks,

Janpieter Sollie

--- a/drivers/net/mdio/mdio-i2c.c       2025-09-19 14:08:41.285357818 +0200
+++ b/drivers/net/mdio/mdio-i2c.c       2025-09-19 14:10:24.962796149 +0200
@@ -253,7 +253,7 @@ static int i2c_rollball_mii_poll(struct mii_bus *bus, int bus_addr, u8 *buf,
          */
         i = 10;
         do {
-               msleep(20);
+               msleep(20+(10*(10-i)));

                 ret = i2c_transfer_rollball(i2c, msgs, ARRAY_SIZE(msgs));
                 if (ret)
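
To make the effect of the changed msleep() argument easier to judge, here is a small userspace sketch of the delay schedule. It is not kernel code. The constants 20, 10 and the start value i = 10 come from the patch above; the assumption that i counts down by one per attempt, from 10 to 0 inclusive, is mine, since the loop condition is not visible in the hunk. The helper names are made up for illustration.

```c
#include <assert.h>

/* Values taken from the patch: 20 ms base sleep, 10 ms step, i starts at 10. */
enum { BASE_MS = 20, STEP_MS = 10, START_I = 10 };

/* Sleep before one poll attempt, mirroring msleep(20+(10*(10-i))). */
static int backoff_delay_ms(int i)
{
	assert(i >= 0 && i <= START_I);
	return BASE_MS + STEP_MS * (START_I - i);
}

/*
 * Worst-case total sleep with the patched, growing delay.
 * ASSUMPTION: i counts down from START_I to 0 inclusive (11 attempts).
 */
static int total_backoff_ms(void)
{
	int total = 0;

	for (int i = START_I; i >= 0; i--)
		total += backoff_delay_ms(i);
	return total;
}

/*
 * Worst-case total sleep with the original fixed msleep(20),
 * under the same countdown assumption, for a given start value of i.
 */
static int total_fixed_ms(int start_i)
{
	return BASE_MS * (start_i + 1);
}
```

Under that countdown assumption, the first attempt still sleeps 20 ms, so a module that answers quickly sees no extra delay, while the worst case for a module that never answers grows from total_fixed_ms(10) = 220 ms to total_backoff_ms() = 770 ms per poll.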
