Date:	Fri, 17 Jul 2009 10:15:52 -0400
From:	Neil Horman <nhorman@...driver.com>
To:	David Hill <hilld@...arystorm.net>
Cc:	Andrew Morton <akpm@...ux-foundation.org>, netdev@...r.kernel.org,
	bugzilla-daemon@...zilla.kernel.org,
	bugme-daemon@...zilla.kernel.org
Subject: Re: [Bugme-new] [Bug 13553] New: When NETCONSOLE is enabled
	in kernel, computer crashes after 120seconds (approx)

On Fri, Jul 17, 2009 at 01:55:44AM -0400, David Hill wrote:
> Hi again,
> Look at bug 13219.  I'm not sure the bug is related to NETCONSOLE.
> It may be in the NIC drivers, the miidiag/ethtool tools, or something
> else.
> The behavior of the system is random.
>
> I attached the NMI stack trace.  As for the kdump, I need to read a
> bit more about it, and I think I'll need to patch the kernel.  Will I?
>
> Thanks again,
>
> Dave
>
Neither of the logs you attached in the associated bugs seems to include the NMI
lockup backtrace.  As for kdump, no, you won't need to patch the kernel, but
depending on what kernel you're using, you may need to build it with
CONFIG_CRASH_DUMP and CONFIG_KEXEC turned on.

Neil
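
For readers following the thread, the recursion suspected further down (the ethtool set path takes a driver lock, calls printk, and netconsole re-enters the driver for the same lock) can be illustrated with a small userspace sketch. This is not e100 code: `dev_lock`, `netconsole_write`, `printk`, and `ethtool_set_settings` are hypothetical stand-ins, and a timeout replaces the endless spin so the script can report the problem instead of hanging.

```python
import threading

# Hypothetical stand-in for a driver-private lock. threading.Lock is
# non-reentrant, so a second acquire from the same thread cannot succeed,
# much like a kernel spinlock taken twice on the same CPU.
dev_lock = threading.Lock()


def netconsole_write(msg):
    """Stand-in for netconsole pushing a log line through the NIC driver."""
    # A real spinlock would spin forever here; the timeout only exists so
    # this sketch can report the self-deadlock instead of hanging.
    if not dev_lock.acquire(timeout=0.1):
        return "deadlock"
    try:
        return "sent"
    finally:
        dev_lock.release()


def printk(msg):
    """Stand-in for printk: with netconsole active, console output re-enters the driver."""
    return netconsole_write(msg)


def ethtool_set_settings():
    """Stand-in for the ethtool set path: log a message while holding the lock."""
    with dev_lock:
        return printk("link settings changed")


result_locked = ethtool_set_settings()       # printk called with the lock held
result_unlocked = printk("unrelated message")  # printk called with the lock free
print(result_locked, result_unlocked)
```

Run as is, the script prints `deadlock sent`: the message logged while the lock is held cannot get through, while the same path works once the lock is free. Whether this matches the actual bug is exactly what the requested stack trace would confirm.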

>
> ----- Original Message ----- From: "David Hill" <hilld@...arystorm.net>
> To: "Neil Horman" <nhorman@...driver.com>; "Andrew Morton"  
> <akpm@...ux-foundation.org>
> Cc: <netdev@...r.kernel.org>; <bugzilla-daemon@...zilla.kernel.org>;  
> <bugme-daemon@...zilla.kernel.org>
> Sent: Thursday, July 16, 2009 1:42 AM
> Subject: Re: [Bugme-new] [Bug 13553] New: When NETCONSOLE is enabled
> in kernel, computer crashes after 120seconds (approx)
>
>
>> Will try that in the next few days... sorry for the delay.  I was on  
>> vacation for the last 2 weeks and thus, out of town :D
>>
>>
>>
>> ----- Original Message ----- From: "Neil Horman" 
>> <nhorman@...driver.com>
>> To: "Andrew Morton" <akpm@...ux-foundation.org>
>> Cc: <netdev@...r.kernel.org>; <bugzilla-daemon@...zilla.kernel.org>;  
>> <bugme-daemon@...zilla.kernel.org>; <hilld@...arystorm.net>
>> Sent: Tuesday, June 23, 2009 9:05 PM
>> Subject: Re: [Bugme-new] [Bug 13553] New: When NETCONSOLE is enabled
>> in kernel, computer crashes after 120seconds (approx)
>>
>>
>>> On Tue, Jun 23, 2009 at 02:07:43PM -0700, Andrew Morton wrote:
>>>>
>>>> (switched to email.  Please respond via emailed reply-to-all, not via the
>>>> bugzilla web interface).
>>>>
>>>> On Wed, 17 Jun 2009 01:55:54 GMT
>>>> bugzilla-daemon@...zilla.kernel.org wrote:
>>>>
>>>> > http://bugzilla.kernel.org/show_bug.cgi?id=13553
>>>> >
>>>> >            Summary: When NETCONSOLE is enabled in kernel, computer crashes
>>>> >                     after 120seconds (approx)
>>>> >            Product: Networking
>>>> >            Version: 2.5
>>>> >     Kernel Version: 2.6.29.4, 2.6.30
>>>> >           Platform: All
>>>> >         OS/Version: Linux
>>>> >               Tree: Mainline
>>>> >             Status: NEW
>>>> >           Severity: high
>>>> >           Priority: P1
>>>> >          Component: Other
>>>> >         AssignedTo: acme@...stprotocols.net
>>>> >         ReportedBy: hilld@...arystorm.net
>>>> >         Regression: No
>>>> >
>>>> >
>>>>
>>>> > 00:00.0 Host bridge: Intel Corporation 440GX - 82443GX Host bridge
>>>> > 00:01.0 PCI bridge: Intel Corporation 440GX - 82443GX AGP bridge
>>>> > 00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 02)
>>>> > 00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
>>>> > 00:07.2 USB Controller: Intel Corporation 82371AB/EB/MB PIIX4 USB (rev 01)
>>>> > 00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 02)
>>>> > 00:0b.0 SCSI storage controller: Adaptec AIC-7896U2/7897U2
>>>> > 00:0b.1 SCSI storage controller: Adaptec AIC-7896U2/7897U2
>>>> > 00:0d.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100
>>>> > (rev 08)
>>>> > 00:12.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
>>>> > RTL-8139/8139C/8139C+ (rev 10)
>>>> > 01:00.0 VGA compatible controller: ATI Technologies Inc Rage 128 RL/VR AGP
>>>> >
>>>> > ------- Comment #2 From David Hill 2009-06-17 02:55:56 -------
>>>> >
>>>> > With NETCONSOLE enabled, if I type:
>>>> > ethtool -s eth1 speed 100 duplex full autoneg on
>>>> >
>>>> > the computer freezes with kernel 2.6.29.4 and 2.6.30...
>>>> >
>>>> > I can reproduce it anytime you want.
>>>> >
>>>>
>>>> Interesting.  I wonder what the significance is of the 120 seconds.  I
>>>> see no such timers in e100.c.  Does the networking core have timers on
>>>> such intervals?
>>>>
>>> My guess is that the 120 seconds has less to do with the driver and more to do
>>> with some other periodic event in the kernel that triggers a message being
>>> written to the console, which in turn triggers whatever deadlock is being hit
>>> here.  I imagine we could diagnose it pretty quickly if a stack trace or vmcore
>>> could be captured.  David, can you enable the NMI watchdog on this system so
>>> that it triggers a panic after a deadlock?  Then, if you could enable a second
>>> serial console, or set up kdump to capture a vmcore, we should be able to figure
>>> out what's going on.  My guess is that in the e100 driver we're taking a lock in
>>> the ethtool set path, then calling printk, which winds up recursing into the
>>> driver and trying to take the same lock again.  A stack trace will tell us for
>>> certain.
>>>
>>> Regards
>>> Neil
>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>> the body of a message to majordomo@...r.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>
>>> -- 
>>> This message has been scanned for viruses and
>>> dangerous content by MailScanner, and is
>>> believed to be clean.
>>>
>>>
>>>
>>
>