Date:	Wed, 31 Mar 2010 13:55:22 +0100
From:	Stuart Shelton <stuart@...nobjects.com>
To:	mchan@...adcom.com, netdev@...r.kernel.org
Subject: Severe regression in bnx2 driver with bonding in post 2.6.30 kernels


Hi all,

The Broadcom NetXtreme II driver (bnx2) appears to have a severe 
regression in all kernels after 2.6.30 - I've observed problems with 
2.6.31, 2.6.32, and 2.6.33.

The hardware impacted is an IBM Bladecenter LS21 Blade, model 7971.  We 
have a large number of these, and all are affected.

We use generic channel-bonding, with the following options in modprobe.conf:

alias bond0 bonding
options bond0 mode=0 miimon=100
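
For readers unfamiliar with the bonding options above: mode=0 is the 
round-robin policy (balance-rr) and miimon=100 asks the bonding driver to 
poll link state every 100 ms. A minimal sketch of decoding such a line 
follows. The helper names parse_bonding_options and describe are 
hypothetical and are not part of any tool mentioned in this report; only 
numeric mode values are mapped to names.

```python
# Hypothetical helpers for illustration only; not part of any real tool.
# Mode numbers and names follow the Linux bonding driver's documented modes.
BONDING_MODES = {
    0: "balance-rr",
    1: "active-backup",
    2: "balance-xor",
    3: "broadcast",
    4: "802.3ad",
    5: "balance-tlb",
    6: "balance-alb",
}


def parse_bonding_options(line):
    """Return (alias, options) for a line like 'options bond0 mode=0 miimon=100'."""
    fields = line.split()
    if len(fields) < 2 or fields[0] != "options":
        raise ValueError("not an options line: %r" % line)
    alias = fields[1]
    # Each remaining field is expected to look like key=value.
    opts = dict(field.split("=", 1) for field in fields[2:])
    return alias, opts


def describe(line):
    """Render the mode and miimon settings of an options line as readable text."""
    alias, opts = parse_bonding_options(line)
    mode_raw = opts.get("mode", "0")        # the driver defaults to mode 0
    if mode_raw.isdigit():
        mode_name = BONDING_MODES.get(int(mode_raw), "unknown")
    else:
        mode_name = mode_raw                # mode may also be given by name
    miimon = opts.get("miimon", "0")        # 0 means MII monitoring is off
    return "%s: mode %s (%s), miimon %s ms" % (alias, mode_raw, mode_name, miimon)


print(describe("options bond0 mode=0 miimon=100"))
```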

With any kernel prior to 2.6.31, the dmesg output reads:

Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.0.1 (May 6, 2009)
   alloc irq_desc for 17 on cpu 0 node 0
   alloc kstat_irqs on cpu 0 node 0
bnx2 0000:02:04.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
bnx2 0000:02:04.0: firmware: requesting bnx2/bnx2-mips-06-4.6.16.fw
...
bnx2 0000:02:04.0: firmware: requesting bnx2/bnx2-rv2p-06-4.6.16.fw
eth0: Broadcom NetXtreme II BCM5706 1000Base-SX (A2) PCI-X 64-bit 133MHz 
found at mem e2000000, IRQ 17, node addr 00:1a:64:bd:21:04
   alloc irq_desc for 18 on cpu 0 node 0
   alloc kstat_irqs on cpu 0 node 0
bnx2 0000:02:05.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
bnx2 0000:02:05.0: firmware: requesting bnx2/bnx2-mips-06-4.6.16.fw
bnx2 0000:02:05.0: firmware: requesting bnx2/bnx2-rv2p-06-4.6.16.fw
eth1: Broadcom NetXtreme II BCM5706 1000Base-SX (A2) PCI-X 64-bit 133MHz 
found at mem e4000000, IRQ 18, node addr 00:1a:64:be:20:80
udev: renamed network interface eth1 to eg1
udev: renamed network interface eth0 to eg0
...
   alloc irq_desc for 32 on cpu 0 node 0
   alloc kstat_irqs on cpu 0 node 0
bnx2 0000:02:04.0: irq 32 for MSI/MSI-X
bnx2: eg0: using MSI
bnx2: eg0 NIC SerDes Link is Up, 1000 Mbps full duplex, receive & 
transmit flow control ON
   alloc irq_desc for 33 on cpu 0 node 0
   alloc kstat_irqs on cpu 0 node 0
bnx2 0000:02:05.0: irq 33 for MSI/MSI-X
bnx2: eg1: using MSI
bnx2: eg1 NIC SerDes Link is Up, 1000 Mbps full duplex, receive & 
transmit flow control ON
bnx2 0000:02:04.0: irq 32 for MSI/MSI-X
bnx2: eg0: using MSI
bonding: bond0: enslaving eg0 as an active interface with a down link.
bnx2: eg0 NIC SerDes Link is Up, 1000 Mbps full duplex, receive & 
transmit flow control ON
bnx2 0000:02:05.0: irq 33 for MSI/MSI-X
bnx2: eg1: using MSI
bonding: bond0: enslaving eg1 as an active interface with a down link.
bonding: bond0: link status definitely up for interface eg0.
bonding: bond0: link status definitely up for interface eg1.
bnx2: eg1 NIC SerDes Link is Up, 1000 Mbps full duplex, receive & 
transmit flow control ON


... however, with 2.6.31 and later kernels, the dmesg output reads:

Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.0.1 (May 6, 2009)
   alloc irq_desc for 17 on node 0
   alloc kstat_irqs on node 0
bnx2 0000:02:04.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
bnx2 0000:02:04.0: firmware: requesting bnx2/bnx2-mips-06-4.6.16.fw
bnx2 0000:02:04.0: firmware: requesting bnx2/bnx2-rv2p-06-4.6.16.fw
eth0: Broadcom NetXtreme II BCM5706 1000Base-SX (A2) PCI-X 64-bit 133MHz 
found at mem e2000000, IRQ 17, node addr 00:1a:64:bd:21:04
   alloc irq_desc for 18 on node 0
   alloc kstat_irqs on node 0
bnx2 0000:02:05.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
bnx2 0000:02:05.0: firmware: requesting bnx2/bnx2-mips-06-4.6.16.fw
bnx2 0000:02:05.0: firmware: requesting bnx2/bnx2-rv2p-06-4.6.16.fw
eth1: Broadcom NetXtreme II BCM5706 1000Base-SX (A2) PCI-X 64-bit 133MHz 
found at mem e4000000, IRQ 18, node addr 00:1a:64:be:20:80
udev: renamed network interface eth1 to eg1
udev: renamed network interface eth0 to eg0
...
   alloc irq_desc for 32 on node 0
   alloc kstat_irqs on node 0
bnx2 0000:02:04.0: irq 32 for MSI/MSI-X
bnx2: eg0: using MSI
bnx2: eg0 NIC SerDes Link is Up, 1000 Mbps full duplex, receive & 
transmit flow control ON
   alloc irq_desc for 33 on node 0
   alloc kstat_irqs on node 0
bnx2 0000:02:05.0: irq 33 for MSI/MSI-X
bnx2: eg0 NIC SerDes Link is Down
bnx2: eg0 NIC SerDes Link is Up, 1000 Mbps full duplex, receive & 
transmit flow control ON
bnx2: eg1: using MSI
bnx2: eg1 NIC SerDes Link is Up, 1000 Mbps full duplex, receive & 
transmit flow control ON
bnx2: Chip reset did not complete
bnx2: eg1 NIC SerDes Link is Down
bnx2: eg1 NIC SerDes Link is Up, 1000 Mbps full duplex, receive & 
transmit flow control ON
bnx2: fw sync timeout, reset code = 4040005
bnx2 0000:02:04.0: irq 32 for MSI/MSI-X
bnx2: Chip reset did not complete
bnx2 0000:02:04.0: irq 32 for MSI/MSI-X
bnx2: Chip reset did not complete
bnx2: Chip reset did not complete
bnx2: fw sync timeout, reset code = 4040005
bnx2 0000:02:05.0: irq 32 for MSI/MSI-X
bnx2: Chip reset did not complete
bnx2 0000:02:05.0: irq 32 for MSI/MSI-X
bnx2: Chip reset did not complete
NET: Registered protocol family 17
bnx2 0000:02:05.0: PCI INT A disabled
bnx2 0000:02:04.0: PCI INT A disabled
Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.0.1 (May 6, 2009)
bnx2 0000:02:04.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
bnx2 0000:02:04.0: firmware: requesting bnx2/bnx2-mips-06-4.6.16.fw
bnx2 0000:02:04.0: firmware: requesting bnx2/bnx2-rv2p-06-4.6.16.fw
eth0: Broadcom NetXtreme II BCM5706 1000Base-SX (A2) PCI-X 64-bit 133MHz 
found at mem e2000000, IRQ 17, node addr 00:1a:64:bd:21:04
bnx2 0000:02:05.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
bnx2 0000:02:05.0: firmware: requesting bnx2/bnx2-mips-06-4.6.16.fw
bnx2 0000:02:05.0: firmware: requesting bnx2/bnx2-rv2p-06-4.6.16.fw
eth1: Broadcom NetXtreme II BCM5706 1000Base-SX (A2) PCI-X 64-bit 133MHz 
found at mem e4000000, IRQ 18, node addr 00:1a:64:be:20:80
udev: renamed network interface eth0 to eg0
udev: renamed network interface eth1 to eg1
bnx2 0000:02:04.0: irq 32 for MSI/MSI-X
bnx2: Chip reset did not complete
bnx2 0000:02:05.0: irq 32 for MSI/MSI-X
bnx2: eg1: using MSI
bnx2: eg1 NIC SerDes Link is Up, 1000 Mbps full duplex, receive & 
transmit flow control ON
bnx2 0000:02:04.0: irq 33 for MSI/MSI-X
bnx2: eg1 NIC SerDes Link is Down
bnx2: eg1 NIC SerDes Link is Up, 1000 Mbps full duplex, receive & 
transmit flow control ON
bnx2: Chip reset did not complete
bnx2 0000:02:04.0: irq 33 for MSI/MSI-X
bnx2: Chip reset did not complete
bnx2: Chip reset did not complete
bnx2: fw sync timeout, reset code = 4040005
bnx2 0000:02:05.0: irq 32 for MSI/MSI-X
bnx2: Chip reset did not complete
bnx2 0000:02:05.0: irq 32 for MSI/MSI-X
bnx2: Chip reset did not complete
bnx2 0000:02:05.0: PCI INT A disabled
bnx2 0000:02:04.0: PCI INT A disabled
Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.0.1 (May 6, 2009)
bnx2 0000:02:04.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
bnx2 0000:02:04.0: firmware: requesting bnx2/bnx2-mips-06-4.6.16.fw
bnx2 0000:02:04.0: firmware: requesting bnx2/bnx2-rv2p-06-4.6.16.fw
eth0: Broadcom NetXtreme II BCM5706 1000Base-SX (A2) PCI-X 64-bit 133MHz 
found at mem e2000000, IRQ 17, node addr 00:1a:64:bd:21:04
bnx2 0000:02:05.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
bnx2 0000:02:05.0: firmware: requesting bnx2/bnx2-mips-06-4.6.16.fw
bnx2 0000:02:05.0: firmware: requesting bnx2/bnx2-rv2p-06-4.6.16.fw
udev: renamed network interface eth0 to eg0
eth0: Broadcom NetXtreme II BCM5706 1000Base-SX (A2) PCI-X 64-bit 133MHz 
found at mem e4000000, IRQ 18, node addr 00:1a:64:be:20:80
udev: renamed network interface eth0 to eg1
bnx2 0000:02:04.0: irq 32 for MSI/MSI-X
bnx2: Chip reset did not complete
bnx2 0000:02:04.0: irq 32 for MSI/MSI-X
bnx2: Chip reset did not complete
bnx2 0000:02:04.0: irq 32 for MSI/MSI-X
bnx2: Chip reset did not complete
bnx2 0000:02:04.0: irq 32 for MSI/MSI-X
bnx2: Chip reset did not complete

... (this later output shows the initial attempt to raise the 
interfaces at boot, followed by me manually removing and re-inserting 
the bnx2 driver).  Alongside this, the console outputs "SIOCSIFFLAGS: 
Device or resource busy".
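
To compare captures like the one above across kernels, it may help to 
count the failure messages per driver load. The following is a minimal 
sketch, assuming that each "Broadcom NetXtreme II Gigabit Ethernet Driver 
bnx2" banner marks a fresh load of the module and that the two failure 
strings are matched as plain substrings. The function name 
summarize_bnx2_log is hypothetical.

```python
from collections import Counter

# Assumption: the driver prints this banner once each time the module loads.
BANNER = "Broadcom NetXtreme II Gigabit Ethernet Driver bnx2"

# Failure messages taken verbatim from the dmesg output in this report.
FAILURES = ("Chip reset did not complete", "fw sync timeout")


def summarize_bnx2_log(text):
    """Return one Counter of failure messages per driver load found in text."""
    loads = []
    for line in text.splitlines():
        if BANNER in line:
            # A new banner starts a new driver load.
            loads.append(Counter())
        elif loads:
            # Lines before the first banner are ignored.
            for marker in FAILURES:
                if marker in line:
                    loads[-1][marker] += 1
    return loads
```

Feeding it the text of a saved dmesg capture returns a list with one 
entry per driver load, so a working boot and a failing boot can be 
compared at a glance.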

On these more recent kernels, the SIOCSIFFLAGS error is always printed, 
yet the network interface still comes up about 50% of the time.  When 
it does not, removing and re-inserting the bnx2 driver sometimes 
results in usable non-bonded interfaces - but as often as not the NICs 
are unusable even in a standard non-bonded configuration.

With a simple reboot back to a 2.6.30 or earlier kernel, the problem 
goes away (even though the firmware file on disk is the same as that 
used with the later kernels).  Every blade we have is affected, so this 
is not a hardware problem (or at least, if it is, then it's a very 
common one!).  I thought that the problem might only occur when bonding 
is used - but I can't now recall what made me think this, and I've not 
been able to get the server downtime to test the issue more extensively.

Any advice/guidance greatly appreciated,

Stuart