Message-ID: <20080922112452.GB5314@ff.dom.local>
Date:	Mon, 22 Sep 2008 11:24:52 +0000
From:	Jarek Poplawski <jarkao2@...il.com>
To:	Badalian Vyacheslav <slavon@...telecom.ru>
Cc:	Denys Fedoryshchenko <denys@...p.net.lb>, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: Machine Check Exception Re: NetDev! Please help!

On Mon, Sep 22, 2008 at 01:40:35PM +0400, Badalian Vyacheslav wrote:
> Thanks for the answer, Jarek!
> I posted it to the bug tracker - http://bugzilla.kernel.org/show_bug.cgi?id=11618
> 
> I don't think it's a hardware error, because we have this problem on 10
> servers running the 2.6.26.2 kernel +)
> On Friday night I compiled 2.6.26.5 and had 2 panics on the one PC that has
> the maximum load and 1 panic on another PC.
> I wrote to the netdev list because the first messages look like:
> 
> [ 4956.420298] CPU 1: Machine Check Exception: 0000000000000005
> [ 4956.420298] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> [ 4956.420300]   Tx Queue             <0>
> [ 4956.420300]   TDH                  <81>
> [ 4956.420301]   TDT                  <81>
> [ 4956.420302]   next_to_use          <81>
> [ 4956.420302]   next_to_clean        <d6>
> [ 4956.420303] buffer_info[next_to_clean]
> [ 4956.420303]   time_stamp           <15498d>
> [ 4956.420304]   next_to_watch        <d6>
> [ 4956.420304]   jiffies              <15511c>
> [ 4956.420305]   next_to_watch.status <1>
> [ 4956.420537] eth1: Detected Tx Unit Hang:
> [ 4956.420538]   TDH                  <b0>
> [ 4956.420538]   TDT                  <b0>
> [ 4956.420539]   next_to_use          <b0>
> [ 4956.420539]   next_to_clean        <5>
> [ 4956.420540] buffer_info[next_to_clean]:
> [ 4956.420540]   time_stamp           <15498e>
> [ 4956.420541]   next_to_watch        <5>
> [ 4956.420542]   jiffies              <15511c>
> [ 4956.420542]   next_to_watch.status <1>
> [ 4956.423064] CPU 1: Bank 0: 3200004000000800
> [ 4956.423190] CPU 1: Bank 5: 3200220024080400
> [ 4956.423315] Kernel panic - not syncing: CPU context corrupt
> [ 4956.423933] Rebooting in 3 seconds..

Yes, messages like these Tx Unit Hang reports often point to netdev
problems, but not when they come together with this Machine Check
Exception and "CPU context corrupt", which should mean some severe
hardware problem (unless some bug, probably not in netdev, triggers them).
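
As a rough illustration of why the log above points at the CPU rather than
the network stack, here is a minimal sketch that decodes the hex values from
the panic. It assumes that the "Machine Check Exception" value is the
IA32_MCG_STATUS register and that the "Bank N" values are IA32_MCi_STATUS
registers, using the flag bit positions from the Intel architecture manuals.
The decode() helper and the two tables are hypothetical names made up for
this example; they are not kernel code.

```python
# Flag bits of IA32_MCG_STATUS (assumed source of "Machine Check Exception: ...").
MCG_STATUS_BITS = {0: "RIPV", 1: "EIPV", 2: "MCIP"}

# Upper flag bits of IA32_MCi_STATUS (assumed source of "Bank N: ...").
MCI_STATUS_BITS = {
    63: "VAL",    # register contains valid error information
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MISC register holds valid data
    58: "ADDRV",  # ADDR register holds a valid address
    57: "PCC",    # processor context corrupt
}


def decode(value_hex, bits):
    """Return the names of the flags set in value_hex, highest bit first."""
    value = int(value_hex, 16)
    return [name for bit, name in sorted(bits.items(), reverse=True)
            if (value >> bit) & 1]


if __name__ == "__main__":
    print("MCG status:", decode("0000000000000005", MCG_STATUS_BITS))
    print("Bank 0:    ", decode("3200004000000800", MCI_STATUS_BITS))
    print("Bank 5:    ", decode("3200220024080400", MCI_STATUS_BITS))
```

Under these assumptions both banks decode to UC, EN and PCC, and PCC
(processor context corrupt) corresponds to the "CPU context corrupt" panic
text. VAL does not appear set in the printed values; one possible
explanation, not confirmed anywhere in this thread, is that the kernel masks
that bit before printing.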

> 
> But on 2.6.26.5 I haven't seen errors like this for 2 days... Also, if the system has no network load, I can't trigger a panic with cpuburn or by compiling sources...
> Anyway, I think it's good that my message also goes to the general mailing list and bugzilla...
> 
> I'll try to get more info... if you or anyone has an idea how to test this bug - I can do it)

I see you've got some advice in bugzilla. These people really know more
about these things, so you should try this first. I think they expect
you to compile the most current kernel version (tip) using git for
this. You can do this using the instructions from Ingo Molnar's README.
Make a script from this: from the beginning to the "git checkout ...".
Of course, you have to install git first. After running the commands
it will download the kernel sources to a subdir (takes time). Copy your
config there, run make oldconfig, make, etc. Then send them the dmesg
after rebooting. If you have any problems - write. Alternatively, I guess,
you could try the current 2.6.27-rc7 kernel at least.
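
The build steps described above could be scripted roughly as follows. This
is only a sketch under assumptions: the repository URL is a placeholder (the
real one is in the README, which is not quoted here), a plain "git clone"
stands in for whatever the README's commands up to "git checkout ..."
actually are, and build_plan() and run_plan() are hypothetical helper names.
By default nothing is executed; the commands are only printed.

```python
import shlex
import subprocess
from pathlib import Path

# Placeholder only: take the real URL from the README mentioned above.
TIP_REPO_URL = "<tip-repository-url-from-README>"


def build_plan(repo_url, old_config, workdir="linux-tip"):
    """Return (cwd, argv) steps: fetch sources, copy config, oldconfig, make."""
    src = str(Path(workdir))
    return [
        (".", ["git", "clone", repo_url, src]),   # assumed stand-in for the README steps
        (".", ["cp", str(old_config), str(Path(workdir) / ".config")]),
        (src, ["make", "oldconfig"]),             # reuse the old config, answer new options
        (src, ["make"]),                          # build the kernel
    ]


def run_plan(plan, dry_run=True):
    """Return the commands as shell-style text; execute them only if dry_run is False."""
    lines = []
    for cwd, argv in plan:
        lines.append(f"(cd {shlex.quote(cwd)} && {shlex.join(argv)})")
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)
    return lines


if __name__ == "__main__":
    # The config path is an example; use the config of the currently running kernel.
    for line in run_plan(build_plan(TIP_REPO_URL, "/boot/config-2.6.26.5")):
        print(line)
```

Installing the new kernel and rebooting are left out on purpose, since those
steps depend on the distribution and boot loader.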

Jarek P.

BTW: could you try to trigger this bug with one network card off?