Message-ID: <49171F49.4040606@shaw.ca>
Date: Sun, 09 Nov 2008 11:35:05 -0600
From: Robert Hancock <hancockr@...w.ca>
To: linux-kernel@...r.kernel.org
Subject: Re: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
Corey Bertram wrote:
> Forgive me if this isn't proper for this list, but throughout Sept-Oct
> there was a thread 'exception Emask 0x0 SAct 0x1 / SErr 0x0 action 0x2
> frozen'. Discussion continued till mid-October and then died off.
> It touched on issues with people seeing the following on
> disk-intensive transactions, I believe.
> [11285.918535] ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
> [11285.918567] ata9.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0
> ncq 1536 out
> [11285.918568] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [11285.918619] ata9.00: status: { DRDY }
> [11285.918635] ata9: hard resetting link
First of all, this is about the most generic error imaginable, a command
timeout, so I'd caution against assuming that any other reports of the
same message are due to the same cause.
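
For reference, the "Emask 0x4 (timeout)" on the res line is a bitmask of
libata error classes. The sketch below decodes such a value. The bit
positions are an assumption taken from the AC_ERR_* definitions in
include/linux/libata.h as best recalled, so check them against the header
for the kernel in use. decode_emask is a hypothetical helper name, not
part of any kernel tool.

```python
# Bit positions assumed from the AC_ERR_* flags in include/linux/libata.h;
# verify against the header of the kernel you are actually running.
AC_ERR_BITS = {
    0: "device error",
    1: "host state machine violation",
    2: "timeout",
    3: "media error",
    4: "ATA bus error",
    5: "host bus error",
    6: "system error",
    7: "invalid argument",
    8: "unknown/other",
    9: "nodev hint",
    10: "NCQ error",
}


def decode_emask(emask):
    """Return the names of all error classes set in an Emask value."""
    return [name for bit, name in sorted(AC_ERR_BITS.items())
            if emask & (1 << bit)]


# The per-command Emask from the report quoted above.
print(decode_emask(0x4))
```

Only the timeout bit is set in 0x4, which says the command never
completed but nothing about why.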
>
> As of today I'm running:
> Linux isis 2.6.27-7-server #1 SMP Tue Nov 4 20:16:57 UTC 2008 x86_64 GNU/Linux
>
> I'm running a 7 disk software raid 5 on a SuperMicro AOC-SAT2-MV8:
> SCSI storage controller: Marvell Technology Group Ltd. MV88SX6081
> 8-port SATAII PCI-X Controller (rev 09)
>
> As far as the drives go, they're 500GB each:
> 4 are Seagate ST3500630AS (7200.10)
> 3 are WDC WD5000AAKS-00YGA0 (yes, I know, the dreaded AAKS)
> *The errors appear to happen on all of the drives; I've changed the
> ordering (because the logs make it appear that 3 drives are the ones
> with issues, and it just changes which drives report the errors).
Is it always the same ports?
7 drives is a fair number; are you sure your PSU can handle it? These
kinds of problems often turn out to be cabling issues, insufficient
power, etc.
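
One way to answer the "same ports?" question is to tally the exception
reports per ataN port from saved dmesg output. This is a minimal sketch,
not an existing tool: count_exceptions_by_port is a hypothetical name, and
the regular expression assumes the "ataN.NN: exception Emask" wording shown
in the logs in this thread.

```python
import re
from collections import Counter

# Matches the first line of a libata error report, e.g.
# "[11285.918535] ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen"
EXCEPTION_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?: exception Emask")


def count_exceptions_by_port(log_text):
    """Count libata exception reports per port in dmesg-style text."""
    return Counter(m.group(1) for m in EXCEPTION_RE.finditer(log_text))


# Three exception lines copied from the dmesg output in this thread.
SAMPLE = """\
[11285.918535] ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11326.988529] ata8.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
[11711.718523] ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
"""

for port, count in sorted(count_exceptions_by_port(SAMPLE).items()):
    print(port, count)
```

Counting by hand over the full dmesg excerpt quoted further down gives
three reports each for ata7 and ata9 and one for ata8. Repeating the tally
after swapping drives between ports would show whether the errors follow
the port or the drive.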
>
> The filesystem is XFS.
>
> These problems manifested when I moved from CentOS 2.6.18 32-bit to
> Ubuntu 2.6.27.2.2 64-bit. The drives and controller were rock solid. I
> reinstalled due to a mobo/CPU change, so I can't say with certainty
> it's not hardware related, but given other people's findings I would
> rule out the changed hardware. I even replaced the controller and
> cables with no change in results.
>
> I initially reported this to the Ubuntu bugtracker in late August and
> just walked away from the system until today, when I attempted to
> upgrade (to Ubuntu 8.10) and try again.
> Typical behavior: Upon transferring, after about 20GB or so it took
> out my raid and caused all the disks to desync. I managed to get all
> but one disk to resync, but I _won't_ be stressing the raid (including
> a rebuild) until this issue is fixed, so I'm not sure what more I can
> provide as far as logs go.
>
> Bug behavior: On large transfers (I've honestly only tried 4-12GB
> transfers), the transfer seems to work at first, then stalls for ~30
> seconds, recovers, transfers for 60 seconds or so, then stalls again,
> constantly repeating this cycle.
>
> Does _anyone_ have any update on this issue?
> Original bug I created:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/263160/
> similar posting it would seem:
> https://bugzilla.redhat.com/show_bug.cgi?id=462425
>
> *NOTE* Yes, these logs are 'old', but the current ones looked exactly
> the same, so I didn't dump any recent ones. I can't reproduce this
> anymore because I can't risk losing the data in the raid.
> dmesg:
> [11285.918535] ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
> [11285.918567] ata9.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0
> ncq 1536 out
> [11285.918568] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [11285.918619] ata9.00: status: { DRDY }
> [11285.918635] ata9: hard resetting link
> [11286.420039] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [11286.460065] ata9.00: max_sectors limited to 256 for NCQ
> [11286.520054] ata9.00: max_sectors limited to 256 for NCQ
> [11286.520059] ata9.00: configured for UDMA/133
> [11286.520077] ata9: EH complete
> [11286.520119] sd 8:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
> [11286.520132] sd 8:0:0:0: [sdd] Write Protect is off
> [11286.520134] sd 8:0:0:0: [sdd] Mode Sense: 00 3a 00 00
> [11286.520154] sd 8:0:0:0: [sdd] Write cache: enabled, read cache:
> enabled, doesn't support DPO or FUA
> [11326.988529] ata8.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
> [11326.988554] ata8.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0
> ncq 1536 out
> [11326.988555] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [11326.988606] ata8.00: status: { DRDY }
> [11326.988623] ata8: hard resetting link
> [11327.500037] ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [11327.580053] ata8.00: max_sectors limited to 256 for NCQ
> [11327.657199] ata8.00: max_sectors limited to 256 for NCQ
> [11327.657202] ata8.00: configured for UDMA/133
> [11327.657207] ata8: EH complete
> [11327.657257] sd 7:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
> [11327.657272] sd 7:0:0:0: [sdc] Write Protect is off
> [11327.657273] sd 7:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> [11327.657296] sd 7:0:0:0: [sdc] Write cache: enabled, read cache:
> enabled, doesn't support DPO or FUA
> [11377.938532] ata7.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
> [11377.938557] ata7.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0
> ncq 1536 out
> [11377.938558] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [11377.938608] ata7.00: status: { DRDY }
> [11377.938624] ata7: hard resetting link
> [11378.440037] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [11378.520056] ata7.00: max_sectors limited to 256 for NCQ
> [11378.600065] ata7.00: max_sectors limited to 256 for NCQ
> [11378.600068] ata7.00: configured for UDMA/133
> [11378.600073] ata7: EH complete
> [11378.600120] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
> [11378.600133] sd 6:0:0:0: [sdb] Write Protect is off
> [11378.600135] sd 6:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> [11378.600155] sd 6:0:0:0: [sdb] Write cache: enabled, read cache:
> enabled, doesn't support DPO or FUA
> [11711.718523] ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
> [11711.718548] ata9.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0
> ncq 1536 out
> [11711.718549] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [11711.718600] ata9.00: status: { DRDY }
> [11711.718616] ata9: hard resetting link
> [11712.220041] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [11712.260058] ata9.00: max_sectors limited to 256 for NCQ
> [11712.320057] ata9.00: max_sectors limited to 256 for NCQ
> [11712.320066] ata9.00: configured for UDMA/133
> [11712.320072] ata9: EH complete
> [11712.320112] sd 8:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
> [11712.320125] sd 8:0:0:0: [sdd] Write Protect is off
> [11712.320127] sd 8:0:0:0: [sdd] Mode Sense: 00 3a 00 00
> [11712.320148] sd 8:0:0:0: [sdd] Write cache: enabled, read cache:
> enabled, doesn't support DPO or FUA
> [11849.328524] ata7.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
> [11849.328549] ata7.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0
> ncq 1536 out
> [11849.328549] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [11849.328600] ata7.00: status: { DRDY }
> [11849.328617] ata7: hard resetting link
> [11849.830037] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [11849.910070] ata7.00: max_sectors limited to 256 for NCQ
> [11849.990053] ata7.00: max_sectors limited to 256 for NCQ
> [11849.990057] ata7.00: configured for UDMA/133
> [11849.990069] ata7: EH complete
> [11849.990109] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
> [11849.990123] sd 6:0:0:0: [sdb] Write Protect is off
> [11849.990125] sd 6:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> [11849.990147] sd 6:0:0:0: [sdb] Write cache: enabled, read cache:
> enabled, doesn't support DPO or FUA
> [11909.629773] ata9.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
> [11909.629797] ata9.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0
> ncq 1536 out
> [11909.629798] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [11909.629849] ata9.00: status: { DRDY }
> [11909.629865] ata9: hard resetting link
> [11910.131295] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [11910.180068] ata9.00: max_sectors limited to 256 for NCQ
> [11910.231316] ata9.00: max_sectors limited to 256 for NCQ
> [11910.231319] ata9.00: configured for UDMA/133
> [11910.231327] ata9: EH complete
> [11910.231381] sd 8:0:0:0: [sdd] 976773168 512-byte hardware sectors (500108 MB)
> [11910.231394] sd 8:0:0:0: [sdd] Write Protect is off
> [11910.231396] sd 8:0:0:0: [sdd] Mode Sense: 00 3a 00 00
> [11910.231417] sd 8:0:0:0: [sdd] Write cache: enabled, read cache:
> enabled, doesn't support DPO or FUA
> [11996.729773] ata7.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
> [11996.729797] ata7.00: cmd 61/03:00:49:00:00/00:00:00:00:00/40 tag 0
> ncq 1536 out
> [11996.729798] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [11996.729848] ata7.00: status: { DRDY }
> [11996.729865] ata7: hard resetting link
> [11997.231291] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [11997.311308] ata7.00: max_sectors limited to 256 for NCQ
> [11997.391306] ata7.00: max_sectors limited to 256 for NCQ
> [11997.391316] ata7.00: configured for UDMA/133
> [11997.391322] ata7: EH complete
> [11997.391366] sd 6:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
> [11997.391378] sd 6:0:0:0: [sdb] Write Protect is off
> [11997.391380] sd 6:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> [11997.391400] sd 6:0:0:0: [sdb] Write cache: enabled, read cache:
> enabled, doesn't support DPO or FUA
>
> /var/log/messages:
> Aug 30 20:12:43 isis kernel: [11285.918635] ata9: hard resetting link
> Aug 30 20:12:43 isis kernel: [11286.420039] ata9: SATA link up 3.0
> Gbps (SStatus 123 SControl 300)
> Aug 30 20:12:43 isis kernel: [11286.460065] ata9.00: max_sectors
> limited to 256 for NCQ
> Aug 30 20:12:43 isis kernel: [11286.520054] ata9.00: max_sectors
> limited to 256 for NCQ
> Aug 30 20:12:43 isis kernel: [11286.520059] ata9.00: configured for UDMA/133
> Aug 30 20:12:43 isis kernel: [11286.520077] ata9: EH complete
> Aug 30 20:12:43 isis kernel: [11286.520119] sd 8:0:0:0: [sdd]
> 976773168 512-byte hardware sectors (500108 MB)
> Aug 30 20:12:43 isis kernel: [11286.520132] sd 8:0:0:0: [sdd] Write
> Protect is off
> Aug 30 20:12:43 isis kernel: [11286.520154] sd 8:0:0:0: [sdd] Write
> cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Aug 30 20:13:24 isis kernel: [11326.988623] ata8: hard resetting link
> Aug 30 20:13:24 isis kernel: [11327.500037] ata8: SATA link up 3.0
> Gbps (SStatus 123 SControl 300)
> Aug 30 20:13:24 isis kernel: [11327.580053] ata8.00: max_sectors
> limited to 256 for NCQ
> Aug 30 20:13:24 isis kernel: [11327.657199] ata8.00: max_sectors
> limited to 256 for NCQ
> Aug 30 20:13:24 isis kernel: [11327.657202] ata8.00: configured for UDMA/133
> Aug 30 20:13:24 isis kernel: [11327.657207] ata8: EH complete
> Aug 30 20:13:24 isis kernel: [11327.657257] sd 7:0:0:0: [sdc]
> 976773168 512-byte hardware sectors (500108 MB)
> Aug 30 20:13:24 isis kernel: [11327.657272] sd 7:0:0:0: [sdc] Write
> Protect is off
> Aug 30 20:13:24 isis kernel: [11327.657296] sd 7:0:0:0: [sdc] Write
> cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Aug 30 20:14:15 isis kernel: [11377.938624] ata7: hard resetting link
> Aug 30 20:14:15 isis kernel: [11378.440037] ata7: SATA link up 3.0
> Gbps (SStatus 123 SControl 300)
> Aug 30 20:14:15 isis kernel: [11378.520056] ata7.00: max_sectors
> limited to 256 for NCQ
> Aug 30 20:14:15 isis kernel: [11378.600065] ata7.00: max_sectors
> limited to 256 for NCQ
> Aug 30 20:14:15 isis kernel: [11378.600068] ata7.00: configured for UDMA/133
> Aug 30 20:14:15 isis kernel: [11378.600073] ata7: EH complete
> Aug 30 20:14:15 isis kernel: [11378.600120] sd 6:0:0:0: [sdb]
> 976773168 512-byte hardware sectors (500108 MB)
> Aug 30 20:14:15 isis kernel: [11378.600133] sd 6:0:0:0: [sdb] Write
> Protect is off
> Aug 30 20:14:15 isis kernel: [11378.600155] sd 6:0:0:0: [sdb] Write
> cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Aug 30 20:19:48 isis kernel: [11711.718616] ata9: hard resetting link
> Aug 30 20:19:49 isis kernel: [11712.220041] ata9: SATA link up 3.0
> Gbps (SStatus 123 SControl 300)
> Aug 30 20:19:49 isis kernel: [11712.260058] ata9.00: max_sectors
> limited to 256 for NCQ
> Aug 30 20:19:49 isis kernel: [11712.320057] ata9.00: max_sectors
> limited to 256 for NCQ
> Aug 30 20:19:49 isis kernel: [11712.320066] ata9.00: configured for UDMA/133
> Aug 30 20:19:49 isis kernel: [11712.320072] ata9: EH complete
> Aug 30 20:19:49 isis kernel: [11712.320112] sd 8:0:0:0: [sdd]
> 976773168 512-byte hardware sectors (500108 MB)
> Aug 30 20:19:49 isis kernel: [11712.320125] sd 8:0:0:0: [sdd] Write
> Protect is off
> Aug 30 20:19:49 isis kernel: [11712.320148] sd 8:0:0:0: [sdd] Write
> cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Aug 30 20:22:06 isis kernel: [11849.328617] ata7: hard resetting link
> Aug 30 20:22:06 isis kernel: [11849.830037] ata7: SATA link up 3.0
> Gbps (SStatus 123 SControl 300)
> Aug 30 20:22:06 isis kernel: [11849.910070] ata7.00: max_sectors
> limited to 256 for NCQ
> Aug 30 20:22:07 isis kernel: [11849.990053] ata7.00: max_sectors
> limited to 256 for NCQ
> Aug 30 20:22:07 isis kernel: [11849.990057] ata7.00: configured for UDMA/133
> Aug 30 20:22:07 isis kernel: [11849.990069] ata7: EH complete
> Aug 30 20:22:07 isis kernel: [11849.990109] sd 6:0:0:0: [sdb]
> 976773168 512-byte hardware sectors (500108 MB)
> Aug 30 20:22:07 isis kernel: [11849.990123] sd 6:0:0:0: [sdb] Write
> Protect is off
> Aug 30 20:22:07 isis kernel: [11849.990147] sd 6:0:0:0: [sdb] Write
> cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Aug 30 20:23:06 isis kernel: [11909.629865] ata9: hard resetting link
> Aug 30 20:23:07 isis kernel: [11910.131295] ata9: SATA link up 3.0
> Gbps (SStatus 123 SControl 300)
> Aug 30 20:23:07 isis kernel: [11910.180068] ata9.00: max_sectors
> limited to 256 for NCQ
> Aug 30 20:23:07 isis kernel: [11910.231316] ata9.00: max_sectors
> limited to 256 for NCQ
> Aug 30 20:23:07 isis kernel: [11910.231319] ata9.00: configured for UDMA/133
> Aug 30 20:23:07 isis kernel: [11910.231327] ata9: EH complete
> Aug 30 20:23:07 isis kernel: [11910.231381] sd 8:0:0:0: [sdd]
> 976773168 512-byte hardware sectors (500108 MB)
> Aug 30 20:23:07 isis kernel: [11910.231394] sd 8:0:0:0: [sdd] Write
> Protect is off
> Aug 30 20:23:07 isis kernel: [11910.231417] sd 8:0:0:0: [sdd] Write
> cache: enabled, read cache: enabled, doesn't support DPO or FUA
> Aug 30 20:24:33 isis kernel: [11996.729865] ata7: hard resetting link
> Aug 30 20:24:34 isis kernel: [11997.231291] ata7: SATA link up 3.0
> Gbps (SStatus 123 SControl 300)
> Aug 30 20:24:34 isis kernel: [11997.311308] ata7.00: max_sectors
> limited to 256 for NCQ
> Aug 30 20:24:34 isis kernel: [11997.391306] ata7.00: max_sectors
> limited to 256 for NCQ
> Aug 30 20:24:34 isis kernel: [11997.391316] ata7.00: configured for UDMA/133
> Aug 30 20:24:34 isis kernel: [11997.391322] ata7: EH complete
> Aug 30 20:24:34 isis kernel: [11997.391366] sd 6:0:0:0: [sdb]
> 976773168 512-byte hardware sectors (500108 MB)
> Aug 30 20:24:34 isis kernel: [11997.391378] sd 6:0:0:0: [sdb] Write
> Protect is off
> Aug 30 20:24:34 isis kernel: [11997.391400] sd 6:0:0:0: [sdb] Write
> cache: enabled, read cache: enabled, doesn't support DPO or FUA