lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 26 Sep 2006 14:10:27 +0900
From:	Tejun Heo <htejun@...il.com>
To:	Molle Bestefich <molle.bestefich@...il.com>
CC:	linux-kernel@...r.kernel.org,
	"linux-ide@...r.kernel.org" <linux-ide@...r.kernel.org>
Subject: Re: SATA repeated failure (command 0x35 timeout, status 0xd8)

[cc'ing linux-ide]

Molle Bestefich wrote:
> Hi,
> 
> I have a box with the following hardware/software configuration.
> 
> # uname -r
> 2.6.16.5
> 
> # lspci | grep SATA
> 00:0b.0 CMD Technology Inc Silicon Image SiI 3112 SATARaid Controller 
> (rev 02)
> 00:0d.0 CMD Technology Inc Silicon Image SiI 3112 SATARaid Controller 
> (rev 02)
> 00:0f.0 CMD Technology Inc Silicon Image SiI 3112 SATARaid Controller 
> (rev 02)
> 
> # cd /sys/bus/scsi/devices; for dev in `ls -1`; do name=`ls -1d
> $dev/block*`; disk=`cat $dev/model`; echo "$name: $disk"; done
> 0:0:0:0/block:sda: Maxtor 6Y250M0
> 1:0:0:0/block:sdb: Maxtor 6Y200M0
> 2:0:0:0/block:sdc: Maxtor 6Y200M0
> 3:0:0:0/block:sdd: Maxtor 6Y200M0
> 4:0:0:0/block:sde: Maxtor 6Y200M0
> 5:0:0:0/block:sdf: Maxtor 6Y200M0
> 
> It can operate normally for the longest time, but all of a sudden the
> following happens.
> 
> # dmesg
> ata2: command 0x35 timeout, stat 0xd8 host_stat 0x1
> ata2: translated ATA stat/err 0xd8/00 to SCSI SK/ASC/ASCQ 0xb/47/00
> ata2: status=0xd8 { Busy }
> sd 1:0:0:0: SCSI error: return code = 0x8000002
> sdb: Current: sense key: Aborted Command
>    Additional sense: Scsi parity error
> end_request: I/O error, dev sdb, sector 323729879
> ATA: abnormal status 0xD8 on port 0xD095E0C7
> ATA: abnormal status 0xD8 on port 0xD095E0C7
> ATA: abnormal status 0xD8 on port 0xD095E0C7
> -- 
> ata2: command 0x35 timeout, stat 0xd8 host_stat 0x1
> ata2: translated ATA stat/err 0xd8/00 to SCSI SK/ASC/ASCQ 0xb/47/00
> ata2: status=0xd8 { Busy }
> sd 1:0:0:0: SCSI error: return code = 0x8000002
> sdb: Current: sense key: Aborted Command
>    Additional sense: Scsi parity error
> end_request: I/O error, dev sdb, sector 323729887
> ATA: abnormal status 0xD8 on port 0xD095E0C7
> ATA: abnormal status 0xD8 on port 0xD095E0C7
> ATA: abnormal status 0xD8 on port 0xD095E0C7
> 
> [ .. et cetera, always with sector += 8 .. ]
> 
> This usually happens during a write to the RAID.
> 
> Funny thing is, it always happens to /dev/sdb.
> It's happened 100s of times, but not once to any other device than 
> /dev/sdb.
> 
> Resetting and running Maxtor PowerMax reports that the drive has no
> hardware problems whatsoever.
> 
> I can't think of anything useful, except that /dev/sdb sits on the
> same controller card as /dev/sda, which is a slightly different disk
> than the rest of the bunch.
> 
> What can/should I do?

Please give a shot at 2.6.18.  It contains improved EH which 1. should 
be able to recover from such conditions 2. gives more detailed 
information about the nature of errors.

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ