Message-ID: <18580.48861.657366.629904@stoffel.org>
Date: Sat, 2 Aug 2008 16:09:01 -0400
From: "John Stoffel" <john@...ffel.org>
To: linasvepstas@...il.com
Cc: "Alistair John Strachan" <alistair@...zero.co.uk>,
linux-kernel@...r.kernel.org
Subject: Re: amd64 sata_nv (massive) memory corruption
>>>>> "Linas" == Linas Vepstas <linasvepstas@...il.com> writes:
Linas> 2008/8/1 Alistair John Strachan <alistair@...zero.co.uk>:
>> On Friday 01 August 2008 18:30:34 Linas Vepstas wrote:
>>> Hi,
>>>
>>> I'm seeing strong, easily reproducible (and silent) corruption on a
>>> SATA-attached disk drive on an amd64 board. It might be the disk
>>> itself, but I doubt it; googling suggests that it's somehow
>>> IOMMU-related, but I cannot confirm this.
>>
>> Nowhere do you explicitly say you have memtest86'ed the RAM.
Linas> It passes memtest86+ just fine. The system has been in heavy
Linas> use doing big science calculations on big datasets (multi-gigabyte)
Linas> for months; these do not get corrupted when copied/moved around
Linas> on the old parallel IDE disk, nor moving/copying on an NFS mount
Linas> to a file server. Only the SATA disk is misbehaving.
Can you post the output of dmesg after a boot, so we can see which
driver is being used? I assume it's the new libata stuff, but maybe
you can also turn on debugging in there as well. Options like
SCSI_DEBUG (in the SCSI menus) might show us more details here.
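If the full dmesg is too long to read through, a small filter can pull
out just the lines that mention the disk driver. What follows is only a
rough sketch: the keyword list and the helper names filter_driver_lines
and read_dmesg are made up for illustration, and the keywords are a
guess at what is relevant, not a complete list.

```python
import re
import subprocess

# Illustrative keywords only (an assumption); extend them to suit the machine.
KEYWORDS = ("sata_nv", "libata", "ata", "scsi")


def filter_driver_lines(log_text, keywords=KEYWORDS):
    """Return the log lines in which some word starts with one of the keywords."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")",
        re.IGNORECASE,
    )
    return [line for line in log_text.splitlines() if pattern.search(line)]


def read_dmesg():
    """Run dmesg and return its output as text (may need root on some systems)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    return result.stdout
```

Calling filter_driver_lines(read_dmesg()) and printing the result should
show which driver claimed the controller. The full, unfiltered output is
still the better thing to post to the list.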
Also, have you tried a new SATA cable by any chance? That's obviously
the cheaper path than getting a new disk...
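To tell whether a cable swap (or any other change) made a difference,
it helps to have a repeatable check. Below is a minimal sketch, assuming
the corruption shows up when a large file is copied onto the SATA disk:
it writes random data, copies it, and compares SHA-256 digests. The
function names and the file size are my own choices for illustration.
Note that a file read back right after writing may be served from the
page cache rather than the disk, so use a file larger than RAM, or drop
caches or remount before verifying.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_test_file(path, size_bytes, chunk_size=1 << 20):
    """Fill a file with random bytes; the size is whatever the caller picks."""
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(os.urandom(n))
            remaining -= n


def copy_and_verify(src, dst):
    """Copy src to dst and report whether the two files hash identically."""
    shutil.copyfile(src, dst)
    return sha256_of(src) == sha256_of(dst)
```

Putting the source file on the known-good IDE disk and the destination
on the suspect SATA disk, then running this a few times before and after
the change, should show whether the mismatch rate moves.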
Good luck,
John