Message-Id: <20110701163719.27afc580.taeuber@bbaw.de>
Date: Fri, 1 Jul 2011 16:37:19 +0200
From: Lars Täuber <taeuber@...w.de>
To: linux-kernel@...r.kernel.org
Subject: Re: [PROBLEM] reproduceable storage errors on high IO load
The same happens with a new 2TB Seagate Constellation ES ST2000NM0011 connected
to the Areca ARC1300 (mvsas).
It only takes a simple »dd if=/dev/zero of=/dev/sd_« to provoke the problem.
When connected to the onboard AMD_AHCI controller (3.0 Gbps), both disks can be
formatted, and the dd command line causes no errors either.
However, dmesg shows some messages when I run the following with the disks still connected to the AHCI controller:
# mdadm -C /dev/md3 -l5 -n3 /dev/sd[cd] missing
# mke2fs -Fj /dev/md3
in dmesg:
[ 1515.340662] md: bind<sdc>
[ 1515.378861] md: bind<sdd>
[ 1515.470912] md/raid:md3: device sdd operational as raid disk 1
[ 1515.470919] md/raid:md3: device sdc operational as raid disk 0
[ 1515.471728] md/raid:md3: allocated 3230kB
[ 1515.471798] md/raid:md3: raid level 5 active with 2 out of 3 devices, algorithm 2
[ 1515.471933] RAID conf printout:
[ 1515.471938] --- level:5 rd:3 wd:2
[ 1515.471944] disk 0, o:1, dev:sdc
[ 1515.471949] disk 1, o:1, dev:sdd
[ 1515.472008] md3: detected capacity change from 0 to 4000797687808
[ 1515.472765] md3: unknown partition table
[ 1918.040121] ata6.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x6 frozen
[ 1918.040259] ata6.00: failed command: WRITE FPDMA QUEUED
[ 1918.040367] ata6.00: cmd 61/00:00:00:00:b4/04:00:cc:00:00/40 tag 0 ncq 524288 out
[ 1918.040371] res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 1918.040625] ata6.00: status: { DRDY }
[ 1918.040718] ata6.00: failed command: WRITE FPDMA QUEUED
[ 1918.040822] ata6.00: cmd 61/00:08:00:04:b4/04:00:cc:00:00/40 tag 1 ncq 524288 out
[ 1918.040825] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 1918.041078] ata6.00: status: { DRDY }
[ 1918.041173] ata6: hard resetting link
[ 1918.041202] ata5.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x6 frozen
[ 1918.041315] ata5.00: failed command: WRITE FPDMA QUEUED
[ 1918.041422] ata5.00: cmd 61/00:00:00:00:b4/04:00:cc:00:00/40 tag 0 ncq 524288 out
[ 1918.041426] res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 1918.041681] ata5.00: status: { DRDY }
[ 1918.041772] ata5.00: failed command: WRITE FPDMA QUEUED
[ 1918.041877] ata5.00: cmd 61/00:08:00:04:b4/04:00:cc:00:00/40 tag 1 ncq 524288 out
[ 1918.041880] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 1918.042133] ata5.00: status: { DRDY }
[ 1918.042227] ata5: hard resetting link
[ 1918.590112] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1918.590155] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1918.592281] ata5.00: configured for UDMA/133
[ 1918.592297] ata5.00: device reported invalid CHS sector 0
[ 1918.592307] ata5.00: device reported invalid CHS sector 0
[ 1918.592322] ata5: EH complete
[ 1918.592804] ata6.00: configured for UDMA/133
[ 1918.592818] ata6.00: device reported invalid CHS sector 0
[ 1918.592827] ata6.00: device reported invalid CHS sector 0
[ 1918.592841] ata6: EH complete
But the format completes successfully.
Is there an important difference between an onboard controller and one connected via a PCIe slot?
I'll try some more SATA controllers on Monday.
In the meantime I'll check the RAM with memtest86+, as suggested by Lee Mathers.
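As a reading aid for the failed commands in the dmesg output above, here is a minimal Python sketch that decodes the "cmd" taskfile fields. It assumes the field order libata uses in its error report (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device) and that, for NCQ commands, the sector count sits in the feature registers and the tag in bits 7:3 of nsect. The function name decode_ncq_taskfile and the 512-byte sector size are illustrative assumptions, not something taken from the kernel.

```python
import re

# Assumed layout of the hex string after "cmd" in a libata error report:
# command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
TASKFILE = re.compile(
    r"^([0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/([0-9a-f]{2})$"
)


def decode_ncq_taskfile(text, sector_size=512):
    """Decode an NCQ taskfile string (hypothetical helper, 512-byte sectors assumed)."""
    match = TASKFILE.match(text.strip().lower())
    if match is None:
        raise ValueError("unexpected taskfile format: %r" % text)

    command = int(match.group(1), 16)
    feature, nsect, lbal, lbam, lbah = (
        int(part, 16) for part in match.group(2).split(":")
    )
    hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(part, 16) for part in match.group(3).split(":")
    )

    # 48-bit LBA: the "hob" (high order byte) registers hold bits 47:24.
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )

    # For NCQ commands the sector count lives in the feature registers;
    # a count of 0 means 65536 sectors.
    sectors = (hob_feature << 8) | feature
    if sectors == 0:
        sectors = 65536

    return {
        "command": command,          # 0x61 is WRITE FPDMA QUEUED
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * sector_size,
        "tag": nsect >> 3,           # NCQ tag is in bits 7:3 of nsect
    }


if __name__ == "__main__":
    for cmd in ("61/00:00:00:00:b4/04:00:cc:00:00/40",
                "61/00:08:00:04:b4/04:00:cc:00:00/40"):
        print(cmd, decode_ncq_taskfile(cmd))
```

Under these assumptions the first failed command decodes to LBA 3434348544 with 1024 sectors (524288 bytes, matching the "ncq 524288 out" in the log) and tag 0; the second is the next 1024 sectors with tag 1. Both ports time out on the same two writes, which is what one would expect from two members of the same array being written in lockstep.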
Have a nice weekend.
Lars