[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aF07I-QtyH8hbupf@pengutronix.de>
Date: Thu, 26 Jun 2025 14:20:51 +0200
From: Oleksij Rempel <o.rempel@...gutronix.de>
To: Lucas Stach <l.stach@...gutronix.de>
Cc: Andrew Lunn <andrew@...n.ch>, Heiner Kallweit <hkallweit1@...il.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
kernel@...gutronix.de, Russell King <linux@...linux.org.uk>,
Maxime Chevallier <maxime.chevallier@...tlin.com>
Subject: Re: [PATCH net-next v2 1/1] phy: micrel: add Signal Quality
Indicator (SQI) support for KSZ9477 switch PHYs
On Thu, Jun 26, 2025 at 07:11:38AM +0200, Oleksij Rempel wrote:
> On Wed, Jun 25, 2025 at 08:06:32PM +0200, Lucas Stach wrote:
> > Hi Oleksij,
> >
> > Am Mittwoch, dem 25.06.2025 um 14:41 +0200 schrieb Oleksij Rempel:
> > > Add support for the Signal Quality Index (SQI) feature on KSZ9477 family
> > > switches. This feature provides a relative measure of receive signal
> > > quality.
> > >
> > > The KSZ9477 PHY provides four separate SQI values for a 1000BASE-T link,
> > > one for each differential pair (Channel A-D). Since the current get_sqi
> > > UAPI only supports returning a single value per port, this
> > > implementation reads the SQI from Channel A as a representative metric.
> >
> > I wonder if it wouldn't be more useful to report the worst SQI from all
> > the channels instead.
>
> It was my first idea too, just to report the worst SQI from all
> channels. But this makes it impossible to report SQI for each pair
> later. If we ever want to support SQI per pair, the current code would
> suddenly start to show only SQI for pair A, not the worst one, so the
> SQI interface would change meaning without warning.
>
> There is another problem if we want to extend the SQI UAPI for per-pair
> support: with 100Mbit/s links, we can't know which pair is used. The PHY
> reports SQI only for the RX pair, which can change depending on MDI-X
> resolution, and with auto MDI-X mode, this PHY doesn't tell us which
> pair it is.
>
> That means, at this point, we have hardware which in some modes can't
> provide pair-related information. So, it is better to keep the already
> existing UAPI explicitly per link instead of per pair. This matches the
> current hardware limits and avoids confusion for users and developers.
> If we want per-pair SQI in the future, the API must handle these cases
> clearly.
...
> > This ends up spending a sizable amount of time just spinning the CPU to
> > collect the samples for the averaging. Given that only very low values
> > seem to indicate a working link, I wonder how significant the
> > fluctuations in reported link quality are in reality. Is it really
> > worth spending 120us of CPU time to average those values?
> >
> > Maybe a running average updated with a new sample each time this
> > function is called would be sufficient?
>
> Hm. Good point. I'l try it. We already have proper interface for this
> case :)
After some more testing with a signal generator, I started to doubt the
usability of our SQI hardware implementation in this case.
The problem is: the signal issue can only be detected if data transfer is
ongoing - more specifically, if data is being received (for example, when
running iperf). The SQI register on this hardware is updated every 3 µs. This
means if any data is received within this window, we can detect noise on the
wire. But if there is no transfer, the SQI register always shows perfect link
quality.
On the other hand, as long as noise or a sine wave is injected into the twisted
pair, it is possible to see bandwidth drops in iperf, but no other error
counters indicate problems. Even the RxErr counter on the PHY side stays silent
(it seems to detect only other kinds of errors). If SQI is actively polled
during this time, it will show a worse value - so in general, SQI seems to be
usable.
At an early stage of the SQI implementation, I didn’t have a strong opinion
about the need to differentiate these interfaces. But now, based on practical
experience, I see that the difference is very important.
It looks like we have two types of SQI hardware implementations:
- Hardware which provides the worst value since last read
- Hardware which automatically updates the value every N microseconds
- Hardware which provides both values
Both types are recommended by the Open Alliance as:
- "worst case SQI value since last read"
- "current SQI value"
My question is: do we really need both interfaces? The "current SQI value"
seems impractical if it only reflects quality during active data transfer.
What do you think?
Best Regards,
Oleksij
---
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
Powered by blists - more mailing lists