Message-ID: <7ebee7c5-4bf3-134d-bc57-ea71e0bdfc60@gmx.net>
Date: Thu, 9 Jan 2020 19:42:27 +0000
From: ѽ҉ᶬḳ℠ <vtol@....net>
To: Russell King - ARM Linux admin <linux@...linux.org.uk>
Cc: Andrew Lunn <andrew@...n.ch>, netdev@...r.kernel.org
Subject: Re: [drivers/net/phy/sfp] intermittent failure in state machine checks
On 09/01/2020 19:01, ѽ҉ᶬḳ℠ wrote:
> On 09/01/2020 17:43, Russell King - ARM Linux admin wrote:
>> On Thu, Jan 09, 2020 at 05:35:23PM +0000, ѽ҉ᶬḳ℠ wrote:
>>> Thank you for the extensive feedback and explanation.
>>>
>>> Pardon for having mixed up the semantics on module specifications vs.
>>> EEPROM dump...
>>>
>>> The module (chipset) has been designed by Metanoia; I am not sure who
>>> the actual manufacturer is, and it has probably just been branded Allnet.
>>> The designer provides some proprietary management software (called EBM)
>>> to their wholesale buyers only.
>> I have one of their early MT-V5311 modules, but it has no accessible
>> EEPROM, and even if it did, it would be of no use to me being
>> unapproved for connection to the BT Openreach network. (BT SIN 498
>> specifies non-standard power profile to avoid crosstalk issues with
>> existing ADSL infrastructure, and I believe they regularly check the
>> connected modem type and firmware versions against an approved list.)
>>
>> I haven't noticed the module I have asserting its TX_FAULT signal,
>> but then its RJ45 has never been connected to anything.
>>
>
> The curious (and somewhat inexplicable) thing is that the module generally
> works, i.e. at some point it must pass the sm checks; otherwise
> connectivity would fail constantly and the module would be generally
> unusable.
>
> The reported issues, however, are intermittent; they can usually be
> reproduced with
>
> ifdown <iface> && ifup <iface>
>
> or rebooting the router that hosts the module.
>
> If some time passes between ifdown and ifup (I am not sure how much, but
> it seems to be in excess of 3 minutes), the sm checks mostly do not fail.
> It somehow "feels" as if the module stores some link signal information
> in a register that does not suit the sm check routine, and only when
> that register clears does the sm check routine pass and connectivity get
> restored.
> ____
>
> Since there are probably other such SFP modules out there, xDSL and
> g.fast, that provide no laser safety circuitry by design (since they do
> not provide connectivity over fibre), would it perhaps make sense to
> check for the existence of laser safety circuitry first, prior to
> getting to the sm checks?
> ____
>
I am wondering whether what is mentioned in
https://gitlab.labs.nic.cz/turris/turris-build/issues/89 is perhaps the
cause of the issue:
Even when/after the SFP module is recognized and the link mode is set
for the NIC to the proper value there can still be the link-up signal
mismatch that we have seen on many non-ethernet SFPs. The thing is that
one of the SFP pins is called LOS (loss of signal) and when the pin is
in active state it is being interpreted by the Linux kernel as "link is
down", turn off the NIC. Unfortunately we have seen a chicken-and-egg
problem with some GPON and DSL SFPs - the SFP does not come up and
deassert LOS unless there is SGMII link from NIC and NIC is not coming
up unless LOS is deasserted.
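
To make the circular dependency described there concrete, here is a toy
model of it. This is only my own sketch under stated assumptions, not the
logic in drivers/net/phy/sfp.c: the function name `settle` and the
`ignore_los` switch are hypothetical and exist purely for illustration.

```python
def settle(ignore_los: bool, max_rounds: int = 10) -> bool:
    """Toy model: return True if the link comes up, False on deadlock."""
    nic_up = False        # host side: SGMII link driven by the NIC
    los_asserted = True   # module side: LOS pin starts out asserted

    for _ in range(max_rounds):
        # Host: bring the NIC up only if LOS is deasserted (or ignored).
        new_nic_up = ignore_los or not los_asserted
        # Module: deassert LOS only once it sees an SGMII link from the NIC.
        new_los_asserted = not new_nic_up
        if (new_nic_up, new_los_asserted) == (nic_up, los_asserted):
            break  # nothing changed any more: fixed point reached
        nic_up, los_asserted = new_nic_up, new_los_asserted

    return nic_up and not los_asserted


if __name__ == "__main__":
    print("honouring LOS:", settle(ignore_los=False))
    print("ignoring LOS: ", settle(ignore_los=True))
```

In this model each side waits for the other, so nothing ever changes
unless the host stops gating the NIC on LOS. Whether that matches what
this particular module does is exactly what I am unsure about.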