Message-ID: <87618083B2453E4A8714035B62D67992504EC970@FMSMSX105.amr.corp.intel.com>
Date:	Thu, 24 Dec 2015 05:58:19 +0000
From:	"Tantilov, Emil S" <emil.s.tantilov@...el.com>
To:	zhuyj <zyjzyj2000@...il.com>,
	"Kirsher, Jeffrey T" <jeffrey.t.kirsher@...el.com>,
	"Brandeburg, Jesse" <jesse.brandeburg@...el.com>,
	"Nelson, Shannon" <shannon.nelson@...el.com>,
	"Wyborny, Carolyn" <carolyn.wyborny@...el.com>,
	"Skidmore, Donald C" <donald.c.skidmore@...el.com>,
	"Allan, Bruce W" <bruce.w.allan@...el.com>,
	"Ronciak, John" <john.ronciak@...el.com>,
	"Williams, Mitch A" <mitch.a.williams@...el.com>,
	"intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"e1000-devel@...ts.sourceforge.net" 
	<e1000-devel@...ts.sourceforge.net>
CC:	"Viswanathan, Ven (Wind River)" <venkat.viswanathan@...driver.com>,
	"Shteinbock, Boris (Wind River)" <boris.shteinbock@...driver.com>,
	"Bourg, Vincent (Wind River)" <vincent.bourg@...driver.com>
Subject: RE: [Intel-wired-lan] [PATCH 1/1] ixgbe: force to synchronize
 reporting "link on" and getting speed and duplex

>-----Original Message-----
>From: zhuyj [mailto:zyjzyj2000@...il.com]
>Sent: Wednesday, December 23, 2015 6:28 PM
>To: Tantilov, Emil S; Kirsher, Jeffrey T; Brandeburg, Jesse; Nelson,
>Shannon; Wyborny, Carolyn; Skidmore, Donald C; Allan, Bruce W; Ronciak,
>John; Williams, Mitch A; intel-wired-lan@...ts.osuosl.org;
>netdev@...r.kernel.org; e1000-devel@...ts.sourceforge.net
>Cc: Viswanathan, Ven (Wind River); Shteinbock, Boris (Wind River); Bourg,
>Vincent (Wind River)
>Subject: Re: [Intel-wired-lan] [PATCH 1/1] ixgbe: force to synchronize
>reporting "link on" and getting speed and duplex
>
>On 12/23/2015 11:59 PM, Tantilov, Emil S wrote:
>>> -----Original Message-----
>>> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@...ts.osuosl.org] On
>>> Behalf Of zyjzyj2000@...il.com
>>> Sent: Tuesday, December 22, 2015 10:47 PM
>>> To: Kirsher, Jeffrey T; Brandeburg, Jesse; Nelson, Shannon; Wyborny,
>>> Carolyn; Skidmore, Donald C; Allan, Bruce W; Ronciak, John; Williams, Mitch
>>> A; intel-wired-lan@...ts.osuosl.org; netdev@...r.kernel.org;
>>> e1000-devel@...ts.sourceforge.net
>>> Cc: Viswanathan, Ven (Wind River); Shteinbock, Boris (Wind River); Bourg,
>>> Vincent (Wind River)
>>> Subject: [Intel-wired-lan] [PATCH 1/1] ixgbe: force to synchronize
>>> reporting "link on" and getting speed and duplex
>>>
>>> From: Zhu Yanjun <zyjzyj2000@...il.com>
>>>
>>> In X540 NIC, there is a time span between reporting "link on" and
>>> getting the speed and duplex. To a bonding driver in 802.3ad mode,
>>> this time span will make it not work well if the time span is big
>>> enough. The big time span will make bonding driver change the state of
>>> the slave device to up while the speed and duplex of the slave device
>>> can not be gotten. Later the bonding driver will not have a chance to
>>> get the speed and duplex of the slave device. The speed and duplex of
>>> the slave device are important to a bonding driver in 802.3ad mode.
>>>
>>> To 82599_SFP NIC and other kinds of NICs, this problem does
>>> not exist. As such, it is necessary for X540 to report "link on" when
>>> the link speed is not IXGBE_LINK_SPEED_UNKNOWN.
>>>
>>> Signed-off-by: Zhu Yanjun <zyjzyj2000@...il.com>
>>> ---
>>> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   16 +++++++++++++++-
>>> 1 file changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> index aed8d02..cb9d310 100644
>>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>>> @@ -6479,7 +6479,21 @@ static void ixgbe_watchdog_link_is_up(struct
>>> ixgbe_adapter *adapter)
>>> 	       (flow_rx ? "RX" :
>>> 	       (flow_tx ? "TX" : "None"))));
>>>
>>> -	netif_carrier_on(netdev);
>>> +	/*
>>> +	 * In X540 NIC, there is a time span between reporting "link on"
>>> +	 * and getting the speed and duplex. To a bonding driver in 802.3ad
>>> +	 * mode, this time span will make it not work well if the time span
>>> +	 * is big enough. To 82599_SFP NIC and other kinds of NICs, this
>>> +	 * problem does not exist. As such, it is better for X540 to report
>>> +	 * "link on" when the link speed is not IXGBE_LINK_SPEED_UNKNOWN.
>>> +	 */
>>> +	if ((hw->mac.type == ixgbe_mac_X540) &&
>>> +	    (link_speed != IXGBE_LINK_SPEED_UNKNOWN)) {
>>> +		netif_carrier_on(netdev);
>>> +	} else {
>>> +		netif_carrier_on(netdev);
>>> +	}
>>> +
>>> 	ixgbe_check_vf_rate_limit(adapter);
>>>
>>> 	/* enable transmits */
>>> --
>>> 1.7.9.5
>> NAK
>>
>> I have already submitted a patch that will address the issue with bonding
>> reporting unknown speed (in /proc/net/bonding/bondX) after the link is
>> established due to link flaps:
>> http://patchwork.ozlabs.org/patch/552485/
>>
>> The bonding driver gets the speed from ethtool and this is where the
>> reporting needs to be fixed. The issue is that the bonding driver polls
>> for netif_carrier_ok() at a certain rate and as such will not be able to
>> detect rapid link changes.
>Thanks for your reply. The root cause you describe is different from my
>problem. My problem is that "link up" is reported before the speed and
>duplex are available. That is, the physical NIC reports "link up" while

The "link up" event is the result of an LSC (link status change) interrupt. The
speed is determined in response to that interrupt by reading the LINKS register.
If the LINKS register reports the speed as unknown, then that is the actual state
of the PHY, meaning the device is re-negotiating the speed for some reason.
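
As a rough illustration of the point above, the following userspace sketch
decodes a LINKS-style register value into a link flag and a speed. It is not
driver code: the DEMO_* masks, the demo_link struct and demo_decode_links() are
hypothetical names made up for this example, and the bit layout is an assumed
placeholder rather than something taken from the ixgbe headers or the datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed, illustrative bit layout. NOT copied from the ixgbe headers. */
#define DEMO_LINKS_UP         0x40000000u
#define DEMO_LINKS_SPEED_MASK 0x30000000u
#define DEMO_LINKS_SPEED_10G  0x30000000u
#define DEMO_LINKS_SPEED_1G   0x20000000u
#define DEMO_LINKS_SPEED_100  0x10000000u

/* Speeds in Mb/s; 0 stands for "unknown". */
enum demo_speed {
	DEMO_SPEED_UNKNOWN = 0,
	DEMO_SPEED_100     = 100,
	DEMO_SPEED_1G      = 1000,
	DEMO_SPEED_10G     = 10000,
};

struct demo_link {
	bool up;
	enum demo_speed speed;
};

/* Decode one snapshot of the (hypothetical) register value. */
static struct demo_link demo_decode_links(uint32_t links)
{
	struct demo_link l;

	l.up = (links & DEMO_LINKS_UP) != 0;

	switch (links & DEMO_LINKS_SPEED_MASK) {
	case DEMO_LINKS_SPEED_10G:
		l.speed = DEMO_SPEED_10G;
		break;
	case DEMO_LINKS_SPEED_1G:
		l.speed = DEMO_SPEED_1G;
		break;
	case DEMO_LINKS_SPEED_100:
		l.speed = DEMO_SPEED_100;
		break;
	default:
		/* Speed field is empty: report what the register says. */
		l.speed = DEMO_SPEED_UNKNOWN;
		break;
	}
	return l;
}
```

In this model a snapshot with the "up" bit set and an empty speed field decodes
to "link up, speed unknown". The decoder only reports what the register held
at the moment it was read.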

>the speed is unknown at the same time. We can run "ethtool ethx" to
>confirm it.

Prior to my patch, the ethtool call reads the LINKS register, which can show the
speed as unknown due to a link flap (for example). You are seeing the momentary
state of the device.
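
The polling limitation mentioned in the earlier reply (the bonding driver
samples netif_carrier_ok() at a fixed rate) can be sketched with a small
userspace simulation. Everything here is hypothetical: demo_event,
demo_carrier_at() and demo_changes_seen() are names invented for the example,
and the timings are made-up numbers, not measurements from an X540.

```c
#include <stdbool.h>
#include <stddef.h>

/* One carrier transition: at time t_ms the carrier becomes `up`. */
struct demo_event {
	unsigned t_ms;
	bool up;
};

/* Carrier state at time t_ms, given events sorted by time. */
static bool demo_carrier_at(const struct demo_event *ev, size_t n,
			    bool initial, unsigned t_ms)
{
	bool state = initial;

	for (size_t i = 0; i < n && ev[i].t_ms <= t_ms; i++)
		state = ev[i].up;
	return state;
}

/*
 * Count the carrier changes a poller notices when it samples every
 * interval_ms, starting at time 0 and stopping after end_ms.
 */
static unsigned demo_changes_seen(const struct demo_event *ev, size_t n,
				  bool initial, unsigned interval_ms,
				  unsigned end_ms)
{
	unsigned seen = 0;
	bool last = demo_carrier_at(ev, n, initial, 0);

	if (interval_ms == 0)
		return 0; /* a zero interval would never advance */

	for (unsigned t = interval_ms; t <= end_ms; t += interval_ms) {
		bool now = demo_carrier_at(ev, n, initial, t);

		if (now != last)
			seen++;
		last = now;
	}
	return seen;
}
```

With a flap that drops the carrier at 130 ms and restores it at 170 ms, a
poller sampling every 100 ms sees no change at all, while one sampling every
10 ms sees both transitions. A status query that happens to land inside the
flap window would likewise observe only that momentary state.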

If you are still seeing the bond report "unknown" speed after applying the patch I
pointed out, please file a bug either through e1000.sf.net or via Intel support and
provide detailed information about the bonding setup, the type of link partner
(switch model etc.), and the full dmesg from the failed scenario, along with the
output of /proc/net/bonding/bond0.

Thanks,
Emil
