Date:	Fri, 11 Jan 2013 11:15:32 +0100
From:	Michael Hunold <hunold@...uxtv.org>
To:	linux-kernel@...r.kernel.org
CC:	andi@...as.de, linux@...nbow-software.org, myxal.mxl@...il.com,
	florian@...kler.org
Subject: Link state change detection problem on Moschip MCS7832 again

Hi,

I have a no-name Moschip MCS7832-based adapter that shows strange
behaviour in my system after a system upgrade. "lsusb -vv" for that
device is attached to the end of the mail.

I am using the adapter for embedded systems development, where it serves 
kernels via TFTP and root filesystems via NFS.

I have recently upgraded my system to Kubuntu 12.10, which uses a 3.5.0-21
kernel. Before that upgrade the device was working fine with Xubuntu 10.10.

I have used the network-manager applet that comes with Kubuntu to assign 
a static IP address to that interface.

The symptom is that when the remote system's bootloader (u-boot in my
case) starts to fetch the kernel via TFTP, it usually starts fine (a
couple of "#" are shown to indicate progress), then timeouts occur
("T" is shown), then progress continues, then more timeouts follow,
and so on.

I can see the following messages getting repeated in /var/log/syslog:

[...]
Jan 11 11:01:04 elmc-teemhu NetworkManager[1250]: <info> (eth1): carrier 
now OFF (device state 100, deferring action for 4 seconds)
Jan 11 11:01:04 elmc-teemhu NetworkManager[1250]: <info> (eth1): carrier 
now ON (device state 100)
[...]

I found the following bug report and this got me going:
https://bugzilla.kernel.org/show_bug.cgi?id=28532

Here is what I investigated so far.

1. I noticed that the patch dabdaf0caa3af520dbc1df87b2fb4e77224037bd
from Ondrej Zary is missing in the kernel that Kubuntu ships, so I
downloaded the most recent mcs7830.c from kernel.org and recompiled the
module. The problem stays the same; there is no improvement.

2. I undid both commits dabdaf0caa3af520dbc1df87b2fb4e77224037bd and 
b1ff4f96fd1c63890d78d8939c6e0f2b44ce3113 which added the "mcs7830: 
Implement link state detection" in the first place. Without that 
"feature" my adapter is now working reliably again.

3. Commit dabdaf0caa3af520dbc1df87b2fb4e77224037bd had the following 
comment:

"The device had an undocumented "feature": it can provide a sequence of
spurious link-down status data even if the link is up all the time.
A sequence of 10 was seen so update the link state only after the device
reports the same link state 20 times."

I tried increasing the number gradually from 20, but that did not fix
the problem. In my desperation I tried 100 as well, but this only
postponed the

4. In my desperation, I went back to the most recent driver and added
the following code to mcs7830_status() in order to track after how many
calls to that function the link state changes.

[...]
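{
	/* Debug only: count status calls between link state changes. */
	static int xxx_counter = 0;	/* calls since the last change */
	static int xxx_link = -1;	/* last seen link state, -1 = none yet */
	if (link != xxx_link) {
		/* State changed: log how many calls the old state lasted. */
		printk("counter %4d -> link %d\n", xxx_counter, link);
		xxx_link = link;
		xxx_counter = 0;
	} else {
		xxx_counter++;
	}
}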
[...]

This resulted in the following output:

Jan 11 11:01:04 elmc-teemhu kernel: [11627.025109] counter  105 -> link 0
Jan 11 11:01:04 elmc-teemhu kernel: [11627.101840] counter   76 -> link 1
Jan 11 11:01:04 elmc-teemhu kernel: [11627.207724] counter  105 -> link 0
Jan 11 11:01:04 elmc-teemhu kernel: [11627.285582] counter   77 -> link 1
Jan 11 11:01:04 elmc-teemhu kernel: [11627.392416] counter  106 -> link 0
Jan 11 11:01:04 elmc-teemhu kernel: [11627.468149] counter   75 -> link 1
Jan 11 11:01:04 elmc-teemhu kernel: [11627.574036] counter  105 -> link 0
Jan 11 11:01:04 elmc-teemhu kernel: [11627.651893] counter   77 -> link 1
Jan 11 11:01:04 elmc-teemhu kernel: [11627.757719] counter  105 -> link 0
Jan 11 11:01:04 elmc-teemhu NetworkManager[1250]: <info> (eth1): carrier 
now OFF (device state 100, deferring action for 4 seconds)
Jan 11 11:01:04 elmc-teemhu kernel: [11627.834546] counter   76 -> link 1
Jan 11 11:01:04 elmc-teemhu NetworkManager[1250]: <info> (eth1): carrier 
now ON (device state 100)
Jan 11 11:01:05 elmc-teemhu kernel: [11627.939259] counter  104 -> link 0
Jan 11 11:01:05 elmc-teemhu kernel: [11628.018204] counter   78 -> link 1

So it seems the link state is constantly toggling, and the network
manager eventually picks that up and does some reconfiguration of the
network interface, which disturbs both TFTP and NFS.
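
As a rough cross-check, the kernel timestamps in the output above can be
turned into a time per status call and a toggle period. The following is
only a userspace sketch, not driver code; the helper names are made up for
illustration, and it assumes that counter + 1 calls to mcs7830_status() lie
between two consecutive "counter N -> link X" lines.

```c
#include <assert.h>

/*
 * Average time per mcs7830_status() call, in milliseconds, between two
 * consecutive "counter N -> link X" lines.
 *
 * t_prev, t_now: kernel timestamps (seconds) of the two lines
 * counter:       the counter value printed on the later line
 *
 * Assumption: counter + 1 calls fall between the two lines, because the
 * debug code resets the counter to 0 on the call that sees the change.
 */
static double ms_per_status_call(double t_prev, double t_now,
				 unsigned int counter)
{
	assert(t_now > t_prev);
	return (t_now - t_prev) * 1000.0 / (double)(counter + 1);
}

/* Length of one full down/up cycle in milliseconds (two lines apart). */
static double cycle_ms(double t_first, double t_third)
{
	assert(t_third > t_first);
	return (t_third - t_first) * 1000.0;
}
```

For the first three lines of the output,
ms_per_status_call(11627.101840, 11627.207724, 105) comes out just under
1 ms and cycle_ms(11627.025109, 11627.207724) at about 183 ms. If the
threshold from the commit message counts status calls, 20 of them would
cover only about 20 ms.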

As I already said above, when I undo both commits everything works
fine again. Network manager is not complaining any more, and TFTP
and NFS are working fine.
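
For reference, the filtering described in the commit message quoted in
point 3 can be modelled in userspace as below. This is a minimal sketch
written from that description, not the driver's actual code; the struct,
the function names and the replay pattern are hypothetical, and the run
lengths (77 link-down readings, 106 link-up readings) are rounded from the
counter output above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical debounce state; not the driver's actual data structure. */
struct debounce {
	bool reported;		/* state last passed on to the network stack */
	unsigned int count;	/* consecutive raw readings that disagree */
	unsigned int threshold;	/* readings needed before a change is accepted */
};

/* Feed one raw reading; returns true if the reported state changed. */
static bool debounce_feed(struct debounce *d, bool raw)
{
	if (raw == d->reported) {
		d->count = 0;
		return false;
	}
	if (++d->count <= d->threshold)
		return false;
	d->reported = raw;
	d->count = 0;
	return true;
}

/*
 * Replay `cycles` repetitions of a raw pattern: `down_len` link-down
 * readings followed by `up_len` link-up readings, starting with the
 * reported state "up". Returns how many state changes get through.
 */
static unsigned int replay(unsigned int threshold, unsigned int down_len,
			   unsigned int up_len, unsigned int cycles)
{
	struct debounce d = { .reported = true, .count = 0,
			      .threshold = threshold };
	unsigned int changes = 0;

	assert(cycles > 0);
	for (unsigned int c = 0; c < cycles; c++) {
		for (unsigned int i = 0; i < down_len; i++)
			if (debounce_feed(&d, false))
				changes++;
		for (unsigned int i = 0; i < up_len; i++)
			if (debounce_feed(&d, true))
				changes++;
	}
	return changes;
}
```

In this model, replay(20, 10, 106, 5) lets no change through, which matches
the short bursts of 10 mentioned in the commit message, while
replay(20, 77, 106, 5) lets through two changes per cycle, because every
run is longer than the threshold.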

Any idea what is wrong with that adapter? Is it unable to report link 
state changes correctly at all?

How can I make the current driver work correctly without reverting the
two commits completely?

Best regards
Michael.

View attachment "lsusb-vv-moschip.txt" of type "text/plain" (2748 bytes)
