Date:	Wed, 09 Jul 2008 01:56:12 +0400
From:	Michael Tokarev <mjt@....msk.ru>
To:	Linux-kernel <linux-kernel@...r.kernel.org>
Subject: 2.6.25: random stalls on certain hardware - regression?

I have little hope of resolving this, but still - maybe someone has an idea...

Hardware is -
  AMD Athlon x2-64 system
  Asus M2N-SLI DELUXE motherboard (nvidia MCP55)
  AMD BE-2400 CPU
  2x 2GB mem
  Adaptec 3950B Ultra2 SCSI adapter (*)
  4x 36GB Seagate SCSI disks
  2x nvidia GbE ethernet
  rtl8139 nic as 3rd one

When booting 2.6.25 - either 32 or 64 bits - the system freezes/hangs
at some random point.  So far there have been about 10 hangs, some
a few minutes after boot, some within hours.

The "hang" is a complete system freeze - on the console there's
still the "... login:" prompt, but nothing works - the keyboard is stuck
(numlock doesn't work), and the server is not responding over the network.
Sometimes ping to the server itself works, but definitely not routing.
(It's a server, so no fancy stuff is loaded - X isn't even installed.)

The kernel is vanilla 2.6.25 - I tried several point releases (.5, .8,
and now .10) - and the effect is the same.

2.6.24 and earlier worked without any glitch (2.6.24 is currently
running).  We tried different BIOS versions (it was 12something
before; we tried 1304 and - currently - 1405, since 1502 is still
beta) - no difference at all.

The problem is that it is a production machine, and quite a few
people depend on it (it's a remote office with only one server),
so I have very limited ability to experiment.  Unfortunately that
rules out git bisect - or at least I'm afraid to try it, both because
it could mean many reboots AND new freezes, and because an unstable
kernel (2.6.25pre stuff) could break something.
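
For a rough sense of what "many reboots" means: a bisect is a binary
search, so the number of kernels to build and boot grows with the base-2
logarithm of the number of candidate commits.  A minimal sketch follows;
the function name and the commit count are placeholders for illustration,
not measured figures for the 2.6.24..2.6.25 range.

```python
def bisect_steps(n_commits: int) -> int:
    """Approximate number of test boots needed to bisect n_commits candidates."""
    if n_commits < 1:
        raise ValueError("need at least one candidate commit")
    # Each step halves the candidate range, so we need ceil(log2(n)) steps.
    # (n - 1).bit_length() computes that exactly for positive integers.
    return (n_commits - 1).bit_length()


# Hypothetical commit count, for illustration only.
print(bisect_steps(10000))
```

Since the hangs here sometimes take hours to show up, each "good" verdict
would also need a long soak before it could be trusted.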

Here are the loaded modules:

ipt_REJECT              3968  2
xt_tcpudp               3712  1
xt_comment              2432  8
iptable_filter          3456  1
ip_tables              11536  1 iptable_filter
x_tables               13316  4 ipt_REJECT,xt_tcpudp,xt_comment,ip_tables
quota_v2                9728  6
xfs                   518456  1
raid0                   7040  1
raid10                 19456  1
usblp                  12416  0
raid456               116764  4
async_xor               3072  1 raid456
async_memcpy            2432  1 raid456
async_tx                2944  1 raid456
xor                    15756  2 raid456,async_xor
usb_storage            81984  0
it87                   20368  0
hwmon_vid               3584  1 it87
hwmon                   3100  1 it87
ohci_hcd               21380  0
ehci_hcd               31756  0
usbcore               119536  5 usblp,usb_storage,ohci_hcd,ehci_hcd
8250_pnp               10112  0
8250                   24196  1 8250_pnp
serial_core            18304  1 8250
forcedeth              46988  0
8139too                22656  0
mii                     5376  1 8139too
ext3                  119944  6
jbd                    39828  1 ext3
mbcache                 7300  1 ext3
aic7xxx               166968  32
scsi_transport_spi     22144  1 aic7xxx
raid1                  19712  3
md_mod                 69532  13 raid0,raid10,raid456,raid1
sd_mod                 25112  36
scsi_mod              127144  4 usb_storage,aic7xxx,scsi_transport_spi,sd_mod

I.e., nothing fancy at all, just the standard modules.

The one difference in compilation between all the 2.6.25 kernels and the
first 2.6.24 is the compiler: currently it's Debian 4.2.3-3 (testing);
before, it was some older version.  But the latest 2.6.24.7 was also
compiled with this very compiler, and it works as usual.

Config is attached, just in case (i686smp).

Any ideas on where to start are welcome.  Meanwhile I'll try the current
2.6.26pre, in the hope that it will not destroy all our data.. ;)

Thanks!

(*) the hardware may seem a bit strange - an old SCSI controller on
a pretty modern system - but it comes from another, roughly 10-year-old
system whose motherboard failed and was replaced with this Asus
one, while the drives (and the controller) are ok.  The mobo has enough
PCI slots (and 2 NICs) to hold all the devices - something
I had.. difficulties finding.

/mjt

View attachment "i686smp" of type "text/plain" (34903 bytes)
