Message-ID: <6DD3782C33561D44B47071B09946026405F65F9333@exchange1>
Date:	Wed, 26 Jan 2011 12:44:52 +0000
From:	"Mills, Tony" <tony.mills@...ex.com>
To:	Michael Chan <mchan@...adcom.com>
CC:	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: bnx2 cards intermittantly going offline

Hi, thanks for your response.

I have done some further investigation and found that a massive number of interrupts were occurring on our Broadcom cards. I have set the affinity for IRQs 36 and 48 to the first two vCPUs on the box, with the Java processes running on the others. I have also replaced the Debian bnx2 driver with the latest from the Broadcom website and set the rx ring buffer to 4080. This has stopped a multiplexed server running at 1.6 cycles per second from missing cycles due to interrupts on the interface, and allows much better processing time. The ring buffer at 4080 also stops the rx_fw_discards I was seeing periodically (raising it to 1020 or 2040 did not resolve the issue, but the maximum setting from the Broadcom driver does).

I am now monitoring the system to see if the card ever becomes unresponsive. But I do have a question.

If I set smp_affinity to a mask of CPUs 0 and 1 (the first two on the box) for the APIC-fasteoi IRQs of the Ethernet devices, it appears that the kernel does not balance them and continues to service the interrupts on the same CPU, even though the other one has spare processing power. Is this a known issue, or am I doing something wrong?

I should add that this is on a Dell R610 with a 12 core Intel Xeon X5680, which shows up as 24 vCPUs.
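
For reference, the mask written to /proc/irq/<n>/smp_affinity is a hexadecimal bitmask with one bit per CPU, so CPUs 0 and 1 together give 3. The following is a minimal Python sketch of that arithmetic only; the function names are hypothetical, it does not write to /proc, and it says nothing about whether the kernel will spread interrupts across the CPUs in the mask.

```python
def cpus_to_affinity_mask(cpus):
    """Return the hex bitmask string for a set of CPU numbers.

    Hypothetical helper: bit N of the mask corresponds to CPU N.
    """
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu
    return format(mask, "x")


def affinity_mask_to_cpus(mask_text):
    """Return the CPU numbers selected by a hex mask string.

    Commas are stripped because the kernel prints wide masks as
    comma-separated 32-bit groups.
    """
    value = int(mask_text.strip().replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


if __name__ == "__main__":
    # CPUs 0 and 1 for the NIC interrupts, CPUs 2..23 for everything else.
    print(cpus_to_affinity_mask([0, 1]))
    print(cpus_to_affinity_mask(range(2, 24)))
```

Reading the mask back from /proc/irq/36/smp_affinity and passing it to affinity_mask_to_cpus is a quick way to confirm that the setting was applied as intended.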

Best Regards

Tony Mills


-----Original Message-----
From: Michael Chan [mailto:mchan@...adcom.com] 
Sent: 18 January 2011 17:56
To: Mills, Tony
Cc: netdev@...r.kernel.org
Subject: Re: bnx2 cards intermittantly going offline


On Tue, 2011-01-18 at 02:54 -0800, Mills, Tony wrote:
> Last night i setup a machine to monitor overnight and at 3:52 this
> morning it became unresponsive. 
> 

When it becomes unresponsive, please send some packets to the NIC (such
as ping) and monitor statistics with ethtool -S.  See if the packets are
being received or discarded.  Also, run tcpdump on the machine to see if
the packets are properly received by the stack.  Thanks.
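
One way to follow the suggestion above is to capture the ethtool -S output twice, before and after sending pings, and compare the counters. The Python sketch below assumes the usual "name: value" line layout of that output; the function names and the sample numbers are made up for illustration, and only rx_fw_discards is a counter name taken from this thread.

```python
def parse_ethtool_stats(text):
    """Parse 'name: value' lines from ethtool -S output into a dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        value = value.strip()
        # Skip the 'NIC statistics:' header and any non-numeric lines.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def changed_counters(before, after):
    """Return {name: delta} for counters whose value changed."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }


if __name__ == "__main__":
    # Illustrative samples only, not real measurements.
    first = "NIC statistics:\n     rx_bytes: 1000\n     rx_fw_discards: 5\n"
    second = "NIC statistics:\n     rx_bytes: 1000\n     rx_fw_discards: 9\n"
    delta = changed_counters(parse_ethtool_stats(first),
                             parse_ethtool_stats(second))
    print(delta)
```

If the receive counters stay flat while a discard counter climbs during the ping, the packets are reaching the NIC but not the stack.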

