linux-kernel - ICH8 libata SATA periodic drive failure

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [day] [month] [year] [list]

Message-Id: <1188484182.9953.11.camel@station-1.ad.isillc.com>
Date:	Thu, 30 Aug 2007 09:29:42 -0500
From:	Jerome Haltom <jhaltom@...llc.com>
To:	linux-kernel@...r.kernel.org
Subject: ICH8 libata SATA periodic drive failure

I have two systems exhibiting a similar problem, even after swapping out
hard drives. Both systems have Intel ICH8 controllers. One is a server
class board with a Xeon, the other is a desktop class board with a Core
Duo. Both are Intel brand boards.

The server box is unable to use a drive in it's 4th, 5th or 6th SATA
port. The drive appears fine, and works fine for a few minutes, but
eventually starts spitting out errors:

[6494068.025887] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[6494068.073915] ata6.00: ata_hpa_resize 1: sectors = 781422768,
hpa_sectors = 781422768
[6494068.132112] ata6.00: ata_hpa_resize 1: sectors = 781422768,
hpa_sectors = 781422768
[6494068.132118] ata6.00: configured for UDMA/33
[6494068.132123] ata6: EH complete
[6494068.132163] SCSI device sdd: 781422768 512-byte hdwr sectors
(400088 MB)
[6494068.132176] sdd: Write Protect is off
[6494068.132179] sdd: Mode Sense: 00 3a 00 00
[6494068.132196] SCSI device sdd: write cache: enabled, read cache:
enabled, doesn't support DPO or FUA
[6500148.311961] ata6: exception Emask 0x10 SAct 0x0 SErr 0x40d0002
action 0x2 frozen
[6500148.312013] ata6: (irq_stat 0x04400040, connection status changed)
[6500148.855772] ata6: waiting for device to spin up (8 secs)
[6500157.017604] ata6: soft resetting port

I have tried three drives in these ports. Ports 1, 2 and 3 all work
perfectly. Also have switched cables and such, including exchanging a
known working cable from port #1 with port#4, etc.

The desktop board I have has similar but slightly different issues.

[54733.636292] ata4: exception Emask 0x10 SAct 0x0 SErr 0x10000 action
0x2 frozen
[54733.636301] ata4: (irq_stat 0x00400000, PHY RDY changed)
[54734.352050] ata4: soft resetting port
[54736.875575] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[54736.978632] ata4.00: configured for UDMA/33
[54736.978636] ata4: EH complete
[54736.979560] sd 3:0:0:0: [sdc] 976773168 512-byte hardware sectors
(500108 MB)
[54736.983558] sd 3:0:0:0: [sdc] Write Protect is off
[54736.983562] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[54736.991557] sd 3:0:0:0: [sdc] Write cache: enabled, read cache:
enabled, doesn't support DPO or FUA

The difference is that ports 1, 2, 3 and 4 work fine. It's just port 5
that has an issue. I have yet to try port 6. There is a DVD drive on it
which seems to work fine.

Now, my first thought was maybe my power-supply couldn't power all the
drives. Figured if I just reorganized the plugs, maybe I could cause a
different drive to starve... but nope. It's consistently the same SATA
ports. If it was a power problem, I would not expect it to stay
localized to the same SATA ports.

The server machine is running Ubuntu Feisty with kernel
2.6.20-15-generic. The desktop machine is running Gutsy with kernel
2.6.22-10-generic.

The driver errors aren't yet good enough for me to make a diagnosis.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/