Message-ID: <20090821143223.GA18008@gradator.net>
Date: Fri, 21 Aug 2009 16:32:23 +0200
From: Sylvain Rochet <gradator@...dator.net>
To: Daniel J Blueman <daniel.blueman@...il.com>
Cc: Linux Kernel <linux-kernel@...r.kernel.org>,
Sylvain Rochet <gradator@...dator.net>
Subject: Re: 2.6.28.9: EXT3/NFS inodes corruption
Hi,
On Fri, Aug 21, 2009 at 12:05:10PM +0100, Daniel J Blueman wrote:
>
> The reason I ask, I was chasing data corruption across the PCIe bus
> with some high-performance Quadrics interconnect adapters a while ago.
> The reproducer involved multiple outstanding main memory read requests
> to related addresses and a small block of data would be returned from
> the wrong offset.
>
> In the end, I found the nVidia CK804 (also MCP55) HT->PCIe bridge was
> at fault and later found disk corruption when doing heavy rsyncs to
> network. This was never publicly acknowledged, but I guess it
> illustrates the need for some micro-tests to verify data-soundness
> under duress; it took a day (and petabytes of data) of the production
> I/O workload to get this data corruption, and 3 seconds with the right
> reproducer (still non-trivial to catch on a PCIe protocol analyser).
>
> Sometime I'll develop a stress-test driver for a common SATA or
> network controller to drive its DMA engine with I/O patterns to and
> from main memory, checking the data integrity every few seconds; this
> could be generalised with OpenGL nicely for graphics cards on
> workstations I imagine.
Hehe, sounds interesting.
Sylvain