Date:	Wed, 10 Mar 2010 21:09:32 -0500
From:	Brian Haley <brian.haley@...com>
To:	Michael Chan <mchan@...adcom.com>
CC:	Bruno Prémont <bonbons@...ux-vserver.org>,
	Benjamin Li <benli@...adcom.com>,
	NetDEV <netdev@...r.kernel.org>,
	Linux-Kernel <linux-kernel@...r.kernel.org>
Subject: Re: BNX2: Kernel crashes with 2.6.31 and 2.6.31.9

Michael Chan wrote:
> On Wed, 2010-03-10 at 15:09 -0800, Brian Haley wrote:
>> Brian Haley wrote:
>>> Hi Michael,
>>>
>>> Michael Chan wrote:
>>>> Do we have timers running in this environment?  The timer in the bnx2
>>>> driver, bnx2_timer(), needs to run to provide a heart beat to the
>>>> firmware.  In netpoll mode without timer interrupts, if we are regularly
>>>> calling the NAPI poll function, it should also be able to provide the
>>>> heartbeat.  Without the heartbeat, the firmware will reset the chip and
>>>> result in the NETDEV WATCHDOG.
>>> We have also been seeing watchdog timeouts with bnx2, below is a
>>> stack trace with Benjamin's debug patch applied.  Normally we were
>>> only seeing them under heavy load, but this one was at boot.  We haven't
>>> tried the latest firmware/driver from 2.6.33 yet.  You can contact me
>>> offline if you need more detailed info.
>> Following up since I have more info on this issue.
>>
>> I'm able to cause a netdev_watchdog timeout by changing the coalesce
>> settings on my bnx2.  I built a little test program for it:
> 
> Do you run this program in a loop?  How quickly do you see the NETDEV
> WATCHDOG?

It's run once, and we see it almost immediately after ETHTOOL_SCOALESCE.

>> 	ecoal.rx_coalesce_usecs = 0;
>> 	ecoal.rx_max_coalesced_frames = 1;
>> 	ecoal.rx_coalesce_usecs_irq = 0;
>> 	ecoal.rx_max_coalesced_frames_irq = 1;
> 
> These rx settings should be ok.  Did you change the tx settings?  If the
> tx settings are all zeros, you won't get any TX interrupts and you can
> get a NETDEV WATCHDOG.

The program reads the current settings before modifying them, so the TX
values should be what they were originally.

> Run ethtool -c eth0 to see what the tx settings are.  Thanks.

# ethtool -c eth0
Coalesce parameters for eth0:
Adaptive RX: off  TX: off
stats-block-usecs: 999936
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 0
rx-frames: 1
rx-usecs-irq: 0
rx-frames-irq: 1

tx-usecs: 80
tx-frames: 20
tx-usecs-irq: 18
tx-frames-irq: 2

rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0

rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0

If I run 'ethtool -c eth0' after the watchdog triggers, either the NIC
or the system hangs completely.

-Brian
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
