Message-ID: <4873E27C.6050504@msgid.tls.msk.ru>
Date: Wed, 09 Jul 2008 01:56:12 +0400
From: Michael Tokarev <mjt@....msk.ru>
To: Linux-kernel <linux-kernel@...r.kernel.org>
Subject: 2.6.25: random stalls on certain hardware - regression?
I have little hope of resolving this, but still, maybe someone has an idea...
Hardware is -
AMD Athlon x2-64 system
Asus M2N-SLI DELUXE motherboard (nvidia MCP55)
AMD BE-2400 CPU
2x 2GB memory
Adaptec 3950B Ultra2 SCSI adapter (*)
4x 36GB Seagate SCSI disks
2x nvidia GbE ethernet
rtl8139 nic as 3rd one
When running 2.6.25, either 32-bit or 64-bit, the system freezes
at some random point. So far there have been about 10 hangs, some
a few minutes after boot, some within hours.
The "hang" is a complete system freeze: the console still shows the
"... login:" prompt, but nothing works. The keyboard is stuck
(numlock doesn't respond), and the server does not respond over the network.
Sometimes ping to the server itself works, but routing definitely does not.
(It's a server, so no fancy stuff is loaded; X isn't even installed.)
The kernel is vanilla 2.6.25. I tried several stable releases (.5, .8,
and now .10), and the effect is the same.
2.6.24 and earlier worked without any glitch (2.6.24 is currently
running). We tried different BIOS versions (it was 12something
before; we tried 1304 and, currently, 1405, since 1502 is still
beta), with no difference at all.
The problem is that it is a production machine, and quite a few
people depend on it (it's a remote office with only one server),
so my ability to experiment is very limited. Unfortunately that rules
out git bisect, or at least I'm afraid to try it, both because it
could mean many reboots AND new freezes, and because an unstable
kernel (2.6.25pre stuff) could break something.
Here are the loaded modules:
ipt_REJECT              3968  2
xt_tcpudp               3712  1
xt_comment              2432  8
iptable_filter          3456  1
ip_tables              11536  1 iptable_filter
x_tables               13316  4 ipt_REJECT,xt_tcpudp,xt_comment,ip_tables
quota_v2                9728  6
xfs                   518456  1
raid0                   7040  1
raid10                 19456  1
usblp                  12416  0
raid456               116764  4
async_xor               3072  1 raid456
async_memcpy            2432  1 raid456
async_tx                2944  1 raid456
xor                    15756  2 raid456,async_xor
usb_storage            81984  0
it87                   20368  0
hwmon_vid               3584  1 it87
hwmon                   3100  1 it87
ohci_hcd               21380  0
ehci_hcd               31756  0
usbcore               119536  5 usblp,usb_storage,ohci_hcd,ehci_hcd
8250_pnp               10112  0
8250                   24196  1 8250_pnp
serial_core            18304  1 8250
forcedeth              46988  0
8139too                22656  0
mii                     5376  1 8139too
ext3                  119944  6
jbd                    39828  1 ext3
mbcache                 7300  1 ext3
aic7xxx               166968  32
scsi_transport_spi     22144  1 aic7xxx
raid1                  19712  3
md_mod                 69532  13 raid0,raid10,raid456,raid1
sd_mod                 25112  36
scsi_mod              127144  4 usb_storage,aic7xxx,scsi_transport_spi,sd_mod
I.e., nothing fancy at all, just the standard modules.
The difference in the build between all the 2.6.25 kernels and the
first 2.6.24 ones is the compiler: currently it's Debian 4.2.3-3
(testing); before it was some older version. But the latest 2.6.24.7
was also compiled with this very compiler, and it works as usual.
Config is attached, just in case (i686smp).
Any ideas on where to start are welcome. Meanwhile I'll try the current
2.6.26pre, in the hope that it will not destroy all our data.. ;)
Thanks!
(*) The hardware may seem a bit strange, with an old SCSI controller
in a pretty modern system. It used to be a different, about 10-year-old
system; its motherboard failed and was replaced with this Asus one,
but the drives (and the controller) are OK. The new board has enough
PCI slots (and 2 NICs) to hold all the devices, something I had..
difficulties finding.
/mjt
View attachment "i686smp" of type "text/plain" (34903 bytes)