Message-ID: <1454311675-24676-1-git-send-email-zyjzyj2000@gmail.com>
Date:	Mon, 1 Feb 2016 15:27:54 +0800
From:	<zyjzyj2000@...il.com>
To:	<zyjzyj2000@...il.com>, <emil.s.tantilov@...el.com>,
	<phillip.j.schmitt@...el.com>, <jeffrey.t.kirsher@...el.com>,
	<netdev@...r.kernel.org>, <e1000-devel@...ts.sourceforge.net>,
	<Boris.Shteinbock@...driver.com>
Subject: ixgbe: get link speed as a slave nic unrelated with link 


Hi, Emil

Thanks for your patch.
After I applied your patch, I received the following feedback from my users.

"
Users had tested the latest patch that you provided and it is much improved now. However it’s still not good enough as the users are planning field deployment. Here are their findings:

So close, but not quite 100%. I did run over 2500 re-negotiations on one interface of a bonded pair and got the 0 MBps status a total of three times. The longest run without a single error was something like 1800 re-negotiations or so. So, this version seems to improve the situation immensely (the unpatched driver fails like 25% of the time), but there still seems to remain some tiny race somewhere.

So it seems the failure occurs once every 600-900 connections.
"

I delved into the source code and found that the following timing window may cause this problem.

bonding                ixgbe 
  |                     |
  |                    carrier_on
  |                     |
  |    <----------------|
 link_up                |
  |                     |
  |                    carrier_off
  |                     |
 get_link_speed ------->|
  |                     |

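At this point the bonding driver considers the link up, but the speed it gets is link_speed_unknown, because the link flapped and the carrier was already off when the speed was queried.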
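
The interleaving in the diagram can be replayed deterministically with a small model. This is only a sketch in Python, not driver code: ToyNic, ToyBond and all of their methods are hypothetical names, SPEED_UNKNOWN is a stand-in for link_speed_unknown, and the 10000 Mb/s value is an arbitrary example.

```python
SPEED_UNKNOWN = -1  # stand-in for link_speed_unknown (assumed value)


class ToyNic:
    """Hypothetical NIC model: speed is reported only while carrier is on."""

    def __init__(self, negotiated_speed):
        self.negotiated_speed = negotiated_speed
        self.carrier = False

    def carrier_on(self):
        self.carrier = True

    def carrier_off(self):
        self.carrier = False

    def get_link_speed(self):
        # With carrier off, the model reports "unknown", as in the mail.
        return self.negotiated_speed if self.carrier else SPEED_UNKNOWN


class ToyBond:
    """Hypothetical bonding model: link state and speed are read separately."""

    def __init__(self, nic):
        self.nic = nic
        self.link_up = False
        self.speed = SPEED_UNKNOWN

    def notice_carrier(self):
        self.link_up = self.nic.carrier

    def query_speed(self):
        self.speed = self.nic.get_link_speed()


def replay_race():
    """Replay the four steps of the diagram in order."""
    nic = ToyNic(10000)
    bond = ToyBond(nic)
    nic.carrier_on()       # ixgbe: carrier_on
    bond.notice_carrier()  # bonding: link_up
    nic.carrier_off()      # ixgbe: carrier_off (link flap)
    bond.query_speed()     # bonding: get_link_speed
    return bond.link_up, bond.speed


if __name__ == "__main__":
    print(replay_race())
```

In this model the bond ends with link_up set and the speed unknown, which matches the 0 MBps status the users reported.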

For a standalone NIC, it is meaningless to get the link speed while the carrier is off. But for a slave NIC it may be helpful, especially when the link flaps.
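
The patch itself is not reproduced in this message, so the following is only one possible reading of the idea, written as a Python sketch. The function report_speed, its parameters, and the SPEED_UNKNOWN value are all hypothetical and do not correspond to ixgbe code.

```python
SPEED_UNKNOWN = -1  # stand-in for link_speed_unknown (assumed value)


def report_speed(carrier_on, is_slave, last_negotiated_speed):
    """Hypothetical policy: a slave NIC reports its last negotiated speed
    regardless of carrier state; a standalone NIC reports it only while
    the carrier is on."""
    if carrier_on or is_slave:
        return last_negotiated_speed
    return SPEED_UNKNOWN


if __name__ == "__main__":
    # Slave NIC queried right after a flap: still reports a usable speed.
    print(report_speed(carrier_on=False, is_slave=True,
                       last_negotiated_speed=10000))
    # Standalone NIC with carrier off: speed stays unknown.
    print(report_speed(carrier_on=False, is_slave=False,
                       last_negotiated_speed=10000))
```

Under this assumption a speed query that lands just after carrier_off no longer returns the unknown value for a slave, while the behaviour of a standalone NIC is unchanged.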

Maybe this patch can close the timing window described above.

Any reply is appreciated.

Best Regards!
Zhu Yanjun
