Message-ID: <CAPu=dv3Px-Y0RRuqasps21=1F4s0Gc9iM5JEQxo-WpVxBKvUOw@mail.gmail.com>
Date: Tue, 31 Jul 2018 16:57:24 +0300
From: Özkan Göksu <ozkan.goksu@...shi.com>
To: linux-scsi@...r.kernel.org
Cc: linux-kernel@...r.kernel.org
Subject: mpt3sas_cm2: attempting host reset! scmd(ffff9e8a88623d48
Hello.
When a disk starts to give I/O errors, mpt3sas first resets the device. If
the disk still reports "task abort" and "I/O error" after that reset,
mpt3sas resets the whole HBA card. Because of this I lose one HBA (LSI) card,
which means 100 disks at the same time, for 30-50 seconds.
I have had this problem for 9-10 months, so I found a workaround: adding
extra timeout to my disks.
--> "for drive in /sys/block/sd*; do echo 180 > $drive/device/timeout; done"
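The same change can be wrapped in a small helper that skips entries without a writable timeout file and prints the value read back, so it is easy to confirm the setting took effect. This is only a sketch: the function name set_scsi_timeout and its optional second argument (an alternate directory, handy for a dry run against a scratch tree) are hypothetical and not part of mpt3sas or any existing tool. The value written to sysfs does not persist across reboots.

```shell
# set_scsi_timeout SECONDS [BLOCK_DIR]
# Hypothetical helper: writes SECONDS to device/timeout for every sd* entry
# under BLOCK_DIR (default /sys/block) and prints the value read back.
set_scsi_timeout() {
    timeout_s=$1
    block_dir=${2:-/sys/block}
    for drive in "$block_dir"/sd*; do
        timeout_file="$drive/device/timeout"
        # Skip unmatched globs and entries we cannot write to.
        [ -w "$timeout_file" ] || continue
        echo "$timeout_s" > "$timeout_file"
        printf '%s: %s\n' "${drive##*/}" "$(cat "$timeout_file")"
    done
}
```

Running "set_scsi_timeout 180" as root should apply the same 180 second value as the one-liner above.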
After increasing the disk timeout, for a long time I still saw I/O errors but
mpt3sas did not reset the HBA card, so I thought the problem was solved. But
after a few months the same problem occurred again somehow.
Now I'm looking for a better solution. Any ideas, please?
This is the dmesg: https://paste.ubuntu.com/p/ZWcRK3BTRV/
OS: Arch Linux 4.14.40-1-lts #1 SMP Wed May 9 13:00:32 CEST 2018 x86_64 GNU/Linux
[Mon Jul 23 05:16:09 2018] mpt3sas_cm2: attempting host reset! scmd(ffff9e8a88623d48)
[Mon Jul 23 05:16:09 2018] mpt3sas_cm2: sending diag reset !!
[Mon Jul 23 05:16:10 2018] mpt3sas_cm2: diag reset: SUCCESS
[Mon Jul 23 05:16:10 2018] mpt3sas_cm2: LSISAS3008: FWVersion(13.00.00.00), ChipRevision(0x02), BiosVersion(08.35.00.00)
[Mon Jul 23 05:16:10 2018] mpt3sas_cm2: Protocol=(
[Mon Jul 23 05:16:10 2018] mpt3sas_cm2: sending port enable !!
[Mon Jul 23 05:16:17 2018] mpt3sas_cm2: port enable: SUCCESS
[Mon Jul 23 05:16:17 2018] mpt3sas_cm2: search for end-devices: start
[Mon Jul 23 05:16:17 2018] mpt3sas_cm2: search for end-devices: complete
[Mon Jul 23 05:16:17 2018] mpt3sas_cm2: search for expanders: start
[Mon Jul 23 05:16:17 2018] mpt3sas_cm2: search for expanders: complete
[Mon Jul 23 05:16:17 2018] mpt3sas_cm2: host reset: SUCCESS scmd(ffff9e8a88623d48)
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: removing unresponding devices: start
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: removing unresponding devices: end-devices
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: removing unresponding devices: expanders
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: removing unresponding devices: complete
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: scan devices: start
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: scan devices: expanders start
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: break from expander scan: ioc_status(0x0022), loginfo(0x310f0400)
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: scan devices: expanders complete
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: scan devices: end devices start
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: break from end device scan: ioc_status(0x0022), loginfo(0x310f0400)
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: scan devices: end devices complete
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: scan devices: complete
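To see which controller is hit and how often, the "attempting host reset" lines in a log like the one above can be counted per controller. This is a rough sketch, not a diagnostic tool: the function name count_host_resets is hypothetical, and it assumes the kernel log keeps the "mpt3sas_cmN: attempting host reset" wording shown in this excerpt.

```shell
# count_host_resets: reads kernel log text on stdin and prints
# "<controller> <count>" for each mpt3sas controller that logged
# an "attempting host reset" message.
count_host_resets() {
    grep -o 'mpt3sas_cm[0-9]*: attempting host reset' |
        cut -d: -f1 |
        sort |
        uniq -c |
        awk '{ print $2, $1 }'
}
```

It can be fed from the live log, for example "dmesg | count_host_resets", or from a saved copy of the log.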