[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.63.0709122215100.16468@trinity.phys.uwm.edu>
Date: Wed, 12 Sep 2007 22:37:56 -0500 (CDT)
From: Bruce Allen <ballen@...vity.phys.uwm.edu>
To: Alan Cox <alan@...rguk.ukuu.org.uk>,
"linux-os (Dick Johnson)" <linux-os@...logic.com>,
Robert Hancock <hancockr@...w.ca>
cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Bruce Allen <bruce.allen@....mpg.de>
Subject: Re: ECC and DMA to/from disk controllers
Alan, Robert, Dick,
Thank you all for the informed and helpful response!
Alan, I'll pass your comments on to Peter Kelemen. Not sure if he follows
LKML. I think he'll be interested in your characterization of the error
types. I'll point him to the thread. (I think Peter and his
collaborators are fairly aware of the undetected error rates in standard
ethernet TCP/IP traffic which as I recall is about one undetected
single-bit error per 4TB transfered. I am pretty sure they have ruled
this out since they have checksums computed after any network transfers.)
Robert, Dick, if I have understood correctly, in response to my specific
question, RAID controllers on PCI cards will DMA data into memory over a
PCI bus using one parity bit per 32 data bits for protection. This does
provide some protection against errors in the data transfer, but much less
protection than typical RAM ECC which has one ECC byte for each eight data
bytes. As I recall, many older motherboards disabled parity on the PCI
bus, so even this protection may be inactive in many cases. From a few
minutes of on-line research, I have the impression that PCI-e has better
ECC protection against address/data errors than PCI but I am not certain.
Thanks again!
Cheers,
Bruce
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists