[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200110114433.GZ25745@shell.armlinux.org.uk>
Date: Fri, 10 Jan 2020 11:44:33 +0000
From: Russell King - ARM Linux admin <linux@...linux.org.uk>
To: ѽ҉ᶬḳ℠ <vtol@....net>
Cc: Andrew Lunn <andrew@...n.ch>, netdev@...r.kernel.org
Subject: Re: [drivers/net/phy/sfp] intermittent failure in state machine
checks
On Fri, Jan 10, 2020 at 09:50:00AM +0000, ѽ҉ᶬḳ℠ wrote:
> On 10/01/2020 09:27, Russell King - ARM Linux admin wrote:
> > On Thu, Jan 09, 2020 at 11:50:14PM +0000, ѽ҉ᶬḳ℠ wrote:
> > > On 09/01/2020 23:10, Russell King - ARM Linux admin wrote:
> > > > Please don't use mii-tool with SFPs that do not have a PHY; the "PHY"
> > > > registers are emulated, and are there just for compatibility. Please
> > > > use ethtool in preference, especially for SFPs.
> > > Sure, just ethtool is not much of help for this particular matter, all there
> > > is ethtool -m and according to you the EEPROM dump is not to be relied on.
> > How about just "ethtool eth2" ?
>
> Settings for eth2:
> Supported ports: [ TP ]
> Supported link modes: 1000baseX/Full
> Supported pause frame use: Symmetric
> Supports auto-negotiation: Yes
> Supported FEC modes: Not reported
> Advertised link modes: 1000baseX/Full
> Advertised pause frame use: Symmetric
> Advertised auto-negotiation: Yes
> Advertised FEC modes: Not reported
> Speed: 1000Mb/s
> Duplex: Full
> Port: Twisted Pair
> PHYAD: 0
> Transceiver: internal
> Auto-negotiation: on
> MDI-X: Unknown
> Supports Wake-on: d
> Wake-on: d
> Link detected: yes
That looks fine.
> > > > CONFIG_DEBUG_GPIO is not the same as having debugfs support enabled.
> > > > If debugfs is enabled, then gpiolib will provide the current state
> > > > of gpios through debugfs. debugfs is normally mounted on
> > > > /sys/kernel/debug, but may not be mounted by default depending on
> > > > policy. Looking in /proc/filesystems will tell you definitively
> > > > whether debugfs is enabled or not in the kernel.
> > > debugsfs is mounted but ls -af /sys/kernel/debug/gpio only producing
> > > (oddly):
> > >
> > > /sys/kernel/debug/gpio
> > Try "cat /sys/kernel/debug/gpio"
>
> gpiochip2: GPIOs 504-511, parent: i2c/8-0071, pca9538, can sleep:
> gpio-504 ( |tx-fault ) in lo IRQ
> gpio-505 ( |tx-disable ) out lo
> gpio-506 ( |rate-select0 ) in lo
> gpio-507 ( |los ) in lo IRQ
> gpio-508 ( |mod-def0 ) in lo IRQ
Which is also indicating everything is correct. When the problem
occurs, check the state of the signals again as close as possible
to the event - it depends how long the transceiver keeps it
asserted. You will probably find tx-fault is indicating
"in hi IRQ".
> Meantime Allnet responded, which basically sums up to (blame ping pong - it
> is not me but go and look there instead...)
>
> - driver support is not being handled by Allnet but by Metanoia, latter
> being designer and manufacturer
> - Allnet does not have the buying power to persuade Metanoia to look into
> the matter
... which is pretty standard; no one will rework their SFP unless
they fear their sales will be severely impacted by the issue.
> - it would appear that SFP.C is trying to communicate with Fiber-GBIC and
> fails since the signal reports may not be 100% compatible
That's a fun claim, but note carefully the wording "may" which implies
some uncertainty in the statement.
Let's look at the wording of the GBIC (SFF-8053) and SFP (INF-8074 -
SFP MSA) documents. The wording for the "fault recovery" is identical
between the two, which concerns what happens when TX_FAULT is asserted
and how to recover from that.
Concerning the implementation of TX_FAULT, SFF-8053 states:
If no transmitter safety circuitry is implemented, the TX_FAULT signal
may be tied to its negated state.
but then says later in the document:
If TX_FAULT is not implemented, the signal shall be held to the low
state by the GBIC.
Meanwhile, INF-8074 similarly states:
If no transmitter safety circuitry is implemented, the TX_FAULT signal
may be tied to its negated state.
but later on has a similar statement:
TX_FAULT shall be implemented by those module definitions of SFP
transceiver supporting safety circuitry. If TX_FAULT is not
implemented, the signal shall be held to the low state by the SFP
transceiver.
"shall" in both cases is stronger than "may". So, there seems to be
little difference between the GBIC and SFP usage of this signal.
Their claim is that sfp.c implements the older GBIC style of signal
reports. My counter-claim is that (a) sfp.c is written to the SFP MSA
and not the GBIC standard, and (b) there is no difference as far as the
TX_FAULT signal is concerned between the GBIC standard and the SFP MSA.
But... it doesn't matter that much, there's a module out there (and it
isn't the only one) which does "funny stuff" with its TX_FAULT signal.
Either we decide we want to support it and implement a quirk, or we
decide we don't want to support it.
There is an option bit in the EEPROM that is supposed to indicate
whether the module supports TX_FAULT, but, as you can guess, there are
problems with using that, as:
1) there are a lot of modules, particularly optical modules, that
implement TX_FAULT correctly but don't set the option bit to say
that they support the signal.
2) the other module I'm aware of that does "funny stuff" with its
TX_FAULT signal does have the TX_FAULT option bit set.
So, the option bit is completely untrustworthy and, therefore, is
meaningless (which is why we don't use it.)
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
Powered by blists - more mailing lists