Message-ID: <20160212155847.GA22910@xanadu.blop.info>
Date:	Fri, 12 Feb 2016 16:58:47 +0100
From:	Lucas Nussbaum <lucas.nussbaum@...ia.fr>
To:	Linux Kernel Network Developers <netdev@...r.kernel.org>
Cc:	David Ertman <davidx.m.ertman@...el.com>,
	Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
	Bruce Allan <bruce.w.allan@...el.com>, linux.nics@...el.com,
	e1000-devel@...ts.sourceforge.net
Subject: e1000e: initialization breaks IPMI support on 80003ES2LAN

Hi,

We have Intel 80003ES2LAN NICs on SGI Altix XE310 servers (SuperMicro
baseboard, product name X7DGT). On those machines, one of the NIC
ports is bridged internally with the BMC.

It seems that during e1000e driver initialization (at boot time),
the NIC is reset, which causes the BMC to stop working: it stops
responding to pings and the IPMI SOL console stops working. A reboot
restores it, until the next driver initialization.

The exact same problem also occurs on Bull Novascale R422E1 nodes, with
the SuperMicro X7DWT baseboard.

This worked in the past: we noticed the regression when upgrading from
Debian wheezy (3.2 kernel) to Debian jessie (3.16 kernel). Using git
bisect, I tracked it down to commit
2800209994f878b00724ceabb65d744855c8f99a (included in Linux 3.15).

Digging further (as this commit is quite large), it seems that this
specific change introduced the problem:

--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -3687,10 +3687,6 @@ void e1000e_power_up_phy(struct e1000_adapter *adapter)
  */
 static void e1000_power_down_phy(struct e1000_adapter *adapter)
 {
-       /* WoL is enabled */
-       if (adapter->wol)
-               return;
-
        if (adapter->hw.phy.ops.power_down)
                adapter->hw.phy.ops.power_down(&adapter->hw);
 }

I also confirmed that reverting this change on top of a 4.4 kernel, with
the following patch, fixes the problem (i.e. the BMC works again).

--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -3791,6 +3791,8 @@ void e1000e_power_up_phy(struct e1000_adapter *adapter)
  */
 static void e1000_power_down_phy(struct e1000_adapter *adapter)
 {
+       return;
+
        if (adapter->hw.phy.ops.power_down)
                adapter->hw.phy.ops.power_down(&adapter->hw);
 }


My understanding is that, during driver initialization, e1000e_reset()
is called, which calls e1000_power_down_phy(), which breaks the BMC.
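
To make that reasoning concrete, here is a small self-contained toy model
in C. It is not driver code: the model_* names, the guard enum, and the
assumption that the BMC is reachable exactly when the PHY stays powered
are all made up for illustration, and the real e1000e_reset() may apply
conditions that this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct e1000_adapter; hypothetical, not the kernel type. */
struct model_adapter {
	bool wol;               /* models a non-zero adapter->wol */
	bool has_power_down_op; /* models hw.phy.ops.power_down != NULL */
	bool phy_powered;       /* assumption: BMC is reachable while this is true */
};

/* The three variants of e1000_power_down_phy() discussed in this report. */
enum model_guard {
	GUARD_NONE,   /* after the bisected commit: no early return */
	GUARD_WOL,    /* before it: return early when WoL is enabled */
	GUARD_ALWAYS, /* the test patch above: unconditional return */
};

static void model_power_down_phy(struct model_adapter *adapter,
				 enum model_guard guard)
{
	assert(adapter != NULL);

	if (guard == GUARD_ALWAYS)
		return;
	if (guard == GUARD_WOL && adapter->wol)
		return;

	if (adapter->has_power_down_op)
		adapter->phy_powered = false;
}

/* Models only the part of the reset path that reaches the PHY power-down. */
static void model_reset(struct model_adapter *adapter, enum model_guard guard)
{
	model_power_down_phy(adapter, guard);
}

/* Returns true if the PHY is still powered after a simulated driver init. */
static bool model_bmc_reachable_after_init(bool wol, enum model_guard guard)
{
	struct model_adapter adapter = {
		.wol = wol,
		.has_power_down_op = true,
		.phy_powered = true,
	};

	model_reset(&adapter, guard);
	return adapter.phy_powered;
}
```

In this model, model_bmc_reachable_after_init(true, GUARD_NONE) returns
false, while GUARD_WOL with wol set and GUARD_ALWAYS both leave the PHY
powered. Note that GUARD_WOL with wol cleared still powers the PHY down,
so restoring the old check would only help where adapter->wol is set.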

Given the comment above the code that was removed, I suspect that this
change could also break WoL, but I haven't confirmed that.

I can test patches if needed.
-- 
| Lucas Nussbaum      Assistant professor @ Univ. de Lorraine |
| lucas.nussbaum@...ia.fr                     LORIA / MADYNES |
| http://www.loria.fr/~lnussbau/            +33 3 54 95 86 19 |
