Date:   Thu, 9 Jan 2020 19:42:27 +0000
From:   ѽ҉ᶬḳ℠ <vtol@....net>
To:     Russell King - ARM Linux admin <linux@...linux.org.uk>
Cc:     Andrew Lunn <andrew@...n.ch>, netdev@...r.kernel.org
Subject: Re: [drivers/net/phy/sfp] intermittent failure in state machine
 checks

On 09/01/2020 19:01, ѽ҉ᶬḳ℠ wrote:
> On 09/01/2020 17:43, Russell King - ARM Linux admin wrote:
>> On Thu, Jan 09, 2020 at 05:35:23PM +0000, ѽ҉ᶬḳ℠ wrote:
>>> Thank you for the extensive feedback and explanation.
>>>
>>> Pardon me for having mixed up the semantics of module specifications
>>> vs. EEPROM dump...
>>>
>>> The module (chipset) was designed by Metanoia; I am not sure who the
>>> actual manufacturer is, and it has probably just been branded Allnet.
>>> The designer provides some proprietary management software (called
>>> EBM) to their wholesale buyers only.
>> I have one of their early MT-V5311 modules, but it has no accessible
>> EEPROM, and even if it did, it would be of no use to me being
>> unapproved for connection to the BT Openreach network.  (BT SIN 498
>> specifies non-standard power profile to avoid crosstalk issues with
>> existing ADSL infrastructure, and I believe they regularly check the
>> connected modem type and firmware versions against an approved list.)
>>
>> I haven't noticed the module I have asserting its TX_FAULT signal,
>> but then its RJ45 has never been connected to anything.
>>
>
> The curious (and sort of inexplicable) thing is that the module works
> in general, i.e. at some point it must pass the sm checks, otherwise
> connectivity would fail constantly and the module would be generally
> unusable.
>
> The reported issues however are intermittent, usually reliably 
> reproducible with
>
> ifdown <iface> && ifup <iface>
>
> or rebooting the router that hosts the module.
>
> If some time passes between ifdown and ifup (I am not sure, but it
> seems to be in excess of 3 minutes), the sm checks mostly do not fail.
> It somehow "feels" as if the module is storing some link signal
> information in a register which does not suit the sm check routine, and
> only when that register clears does the sm check routine pass and
> connectivity get restored.
> ____
>
> Since there are probably other such SFP modules out there, xDSL and
> g.fast, that do not provide laser safety circuitry by design (since
> they do not provide connectivity over fibre), would it perhaps make
> sense to check for the existence of laser safety circuitry first,
> prior to getting to the sm checks?
> ____
>

I am wondering whether what is mentioned in
https://gitlab.labs.nic.cz/turris/turris-build/issues/89 is perhaps the
cause of the issue:

Even when/after the SFP module is recognized and the link mode is set
for the NIC to the proper value there can still be the link-up signal
mismatch that we have seen on many non-ethernet SFPs. The thing is that
one of the SFP pins is called LOS (loss of signal) and when the pin is
in active state it is being interpreted by the Linux kernel as "link is
down", turn off the NIC. Unfortunately we have seen a chicken-and-egg
problem with some GPON and DSL SFPs - the SFP does not come up and
deassert LOS unless there is an SGMII link from the NIC, and the NIC is
not coming up unless LOS is deasserted.
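
To make the circular dependency described above concrete, here is a
minimal toy model in Python. It is only a sketch under stated
assumptions: the function name simulate() and the nic_ignores_los flag
are made up for illustration and do not correspond to anything in the
kernel's sfp driver or in the module firmware.

```python
def simulate(steps, nic_ignores_los):
    """Toy model of the LOS / SGMII chicken-and-egg condition.

    Hypothetical illustration only, not the kernel's state machine.
    Returns (nic_up, los_asserted) after the given number of steps.
    """
    los_asserted = True   # module starts with LOS asserted
    nic_up = False        # NIC starts with its link down

    for _ in range(steps):
        # Host side: the NIC is brought up only while LOS is deasserted,
        # unless the host is told to ignore the LOS pin (assumed option).
        nic_up = nic_ignores_los or not los_asserted
        # Module side: LOS is deasserted only once the module sees the
        # SGMII link coming from the NIC.
        los_asserted = not nic_up

    return nic_up, los_asserted


if __name__ == "__main__":
    # Each side waits for the other: the link never comes up.
    print(simulate(10, nic_ignores_los=False))
    # Ignoring LOS on the host side breaks the cycle in this model.
    print(simulate(10, nic_ignores_los=True))
```

In this model the first call stays stuck at (False, True) no matter how
many steps are run, while the second settles at (True, False). Whether
this is really what happens with the Metanoia module is exactly the
open question.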



