Message-Id: <200909292225.09188.denys@visp.net.lb>
Date:	Tue, 29 Sep 2009 22:25:09 +0300
From:	Denys Fedoryschenko <denys@...p.net.lb>
To:	linux-kernel@...r.kernel.org, linux-scsi@...r.kernel.org,
	Eric.Moore@....com
Subject: MPT Fusion SAS 2.6.31 regression, crash on heavy load

I filed a bugzilla entry and have had no answer for 3 days, although this is
clearly a regression.
http://bugzilla.kernel.org/show_bug.cgi?id=14242

While the MPT SAS controller worked fine on 2.6.30.5, on 2.6.31 it fails under
heavy load and starts spitting errors to dmesg (they vary). Filesystems also
stop, and I am unable to reboot the box properly (only via sysrq or a
hard reset).

x86, Sun Fire X4100, 8 GB RAM, PAE kernel enabled, module loaded with default
options

I upgraded the BIOS and the LSI controller BIOS to the latest versions; it
didn't fix the bug. I cannot do a bisection, because this is a loaded server
and a semi-embedded system. But I can test patches or revert specific commits,
if you point me to the exact commit.

http://www.nuclearcat.com/files/dmesg.ok from 2.6.30.5 kernel
http://www.nuclearcat.com/files/dmesg.fail from 2.6.31.1 kernel
http://www.nuclearcat.com/files/config.gz config from 2.6.31.1 kernel

Let me know if you need any additional information.

Additionally, I have a few other similar units (X4100), but with less RAM
(4 GB), fewer HDDs (only 2), and less load (though still fairly heavy at some
moments), and they are working OK. I don't think it is a hardware issue, since
it is very stable on 2.6.30 and worked on other (older) kernels for a year or
more. It is clearly a regression, and I guess a dangerous one (causing data
loss under high load). I will try to bisect some changes in the mpt driver today.

Please CC me on replies; I am not subscribed to any SCSI/LSI list.

Cross-posting to linux-kernel, since there has been no mail about this issue
from linux-scsi.

Here is some technical info about the controller from lsiutil:
Current active firmware version is 01102800 (1.16.40)
Firmware image's version is MPTFW-01.16.40.00-IE
  LSI Logic
x86 BIOS image's version is MPTBIOS-6.14.04.00 (2007.02.27)

SAS1064's links are 3.0 G, 3.0 G, 3.0 G, 3.0 G

 B___T     SASAddress     PhyNum  Handle  Parent  Type
        50003ba0000003ba           0001           SAS Initiator
        50003ba0000003bb           0002           SAS Initiator
        50003ba0000003bc           0003           SAS Initiator
        50003ba0000003bd           0004           SAS Initiator
 0   0  500000e01277abd2     0     0005    0001   SAS Target
 0   1  500000e011e3b602     1     0006    0001   SAS Target
 0   2  500000e012779792     2     0007    0001   SAS Target
 0   3  500000e0120efb42     3     0008    0001   SAS Target

Type      NumPhys    PhyNum  Handle     PhyNum  Handle  Port  Speed
Adapter      4          0     0001  -->    0     0005     0    3.0
                        1     0001  -->    0     0006     1    3.0
                        2     0001  -->    0     0007     2    3.0
                        3     0001  -->    0     0008     3    3.0

Enclosure Handle   Slots       SASAddress       B___T (SEP)
           0001      4      50003ba0000003ba