[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e18b0fb9-0c6d-ed5e-3a20-dc29e9cc048e@gmx.net>
Date: Fri, 10 Jan 2020 18:44:18 +0000
From: ѽ҉ᶬḳ℠ <vtol@....net>
To: Russell King - ARM Linux admin <linux@...linux.org.uk>
Cc: Andrew Lunn <andrew@...n.ch>, netdev@...r.kernel.org
Subject: Re: [drivers/net/phy/sfp] intermittent failure in state machine
checks
On 10/01/2020 17:38, Russell King - ARM Linux admin wrote:
>
>>> On Fri, Jan 10, 2020 at 04:53:06PM +0000, ѽ҉ᶬḳ℠ wrote:
>>>> Seems that the debug avenue has been exhausted, short of running SFP.C in
>>>> debug mode.
>>> You're saying you never see TX_FAULT asserted other than when the
>>> interface is down?
>> Yes, it never exhibits once the iif is up - it is rock-stable in that state,
>> only ever when being transitioned from down state to up state.
>> Pardon, if that has not been made explicitly clear previously.
> I think if we were to have SFP debug enabled, you'll find that
> TX_FAULT is being reported to SFP as being asserted.
If really necessary I could ask the TOS developers to assist, not sure
whether they would oblidge. Their Master branch build bot compiles twice
a day.
Would it just involve setting a kernel debug flag or something more
elaborate?
>
> You probably aren't running that while loop, as it will exit when
> it sees TX_FAULT asserted. So, here's another bit of shell code
> for you to run:
>
> ip li set dev eth2 down; \
> ip li set dev eth2 up; \
> date
> while :; do
> cat /proc/uptime
> while ! grep -A5 'tx-fault.*in hi' /sys/kernel/debug/gpio; do :; done
> cat /proc/uptime
> while ! grep -A5 'tx-fault.*in lo' /sys/kernel/debug/gpio; do :; done
> done
>
> This will give you output such as:
>
> Fri 10 Jan 17:31:06 GMT 2020
> 774869.13 1535859.48
> gpio-509 ( |tx-fault ) in hi ...
> 774869.14 1535859.49
> gpio-509 ( |tx-fault ) in lo ...
> 774869.15 1535859.50
>
> The first date and "uptime" output is the timestamp when the interface
> was brought up. Subsequent "uptime" outputs can be used to calculate
> the time difference in seconds between the state printed immediately
> prior to the uptime output, and the first "uptime" output.
>
> So in the above example, the tx-fault signal was hi at 10ms, and then
> went low 20ms after the up.
awfully nice of you to provide the code, this is the output when the iif
is brought down and up again and exhibiting the transmit fault.
ip li set dev eth2 down; \
> ip li set dev eth2 up; \
> date
Fri Jan 10 18:34:52 GMT 2020
root@to:~# while :; do
> cat /proc/uptime
> while ! grep -A5 'tx-fault.*in hi' /sys/kernel/debug/gpio; do :; done
> cat /proc/uptime
> while ! grep -A5 'tx-fault.*in lo' /sys/kernel/debug/gpio; do :; done
> done
1865.20 3224.67
gpio-504 ( |tx-fault ) in hi IRQ
gpio-505 ( |tx-disable ) out hi
gpio-506 ( |rate-select0 ) in lo
gpio-507 ( |los ) in lo IRQ
gpio-508 ( |mod-def0 ) in lo IRQ
1871.77 3230.71
gpio-504 ( |tx-fault ) in lo IRQ
gpio-505 ( |tx-disable ) out lo
gpio-506 ( |rate-select0 ) in lo
gpio-507 ( |los ) in lo IRQ
gpio-508 ( |mod-def0 ) in lo IRQ
1919.06 3309.55
gpio-504 ( |tx-fault ) in hi IRQ
gpio-505 ( |tx-disable ) out lo
gpio-506 ( |rate-select0 ) in lo
gpio-507 ( |los ) in lo IRQ
gpio-508 ( |mod-def0 ) in lo IRQ
1919.07 3309.57
gpio-504 ( |tx-fault ) in lo IRQ
gpio-505 ( |tx-disable ) out lo
gpio-506 ( |rate-select0 ) in lo
gpio-507 ( |los ) in lo IRQ
gpio-508 ( |mod-def0 ) in lo IRQ
1920.68 3312.28
gpio-504 ( |tx-fault ) in hi IRQ
gpio-505 ( |tx-disable ) out lo
gpio-506 ( |rate-select0 ) in lo
gpio-507 ( |los ) in lo IRQ
gpio-508 ( |mod-def0 ) in lo IRQ
1921.86 3314.21
gpio-504 ( |tx-fault ) in lo IRQ
gpio-505 ( |tx-disable ) out lo
gpio-506 ( |rate-select0 ) in lo
gpio-507 ( |los ) in lo IRQ
gpio-508 ( |mod-def0 ) in lo IRQ
1921.86 3314.21
> However, bear in mind that even this will not be good enough to spot
> transitory changes on TX_FAULT - as your I2C GPIO expander is interrupt
> capable, watching /proc/interrupts may tell you more.
>
> If the TX_FAULT signal is as stable as you claim it is, you should see
> the interrupt count for it remaining the same.
Once the iif is up those values remain stable indeed.
cat /proc/interrupts | grep sfp
52: 0 0 pca953x 4 Edge sfp
53: 0 0 pca953x 3 Edge sfp
54: 6 0 pca953x 0 Edge sfp
and only incrementing with ifupdown action (which would be logical)
cat /proc/interrupts | grep sfp
52: 0 0 pca953x 4 Edge sfp
53: 0 0 pca953x 3 Edge sfp
54: 11 0 pca953x 0 Edge sfp
Powered by blists - more mailing lists