Message-ID: <49555797.3080009@shaw.ca>
Date: Fri, 26 Dec 2008 16:15:51 -0600
From: Robert Hancock <hancockr@...w.ca>
To: linux-kernel@...r.kernel.org
Cc: linux-kernel@...r.kernel.org
Subject: Re: RFC: detection of silent corruption via ATA long sector reads
Greg Freemyer wrote:
> All,
>
> On the mdraid list, there was a recent thread about using raid
> functionality to detect / repair silent corruption.
>
> The issues brought up were that a lot of silent data corruption occurs
> when cables, controllers, power supplies, RAM, cache, etc. go bad.
>
> It made me think about another option for detecting silent corruption
> I have not seen discussed, but maybe I missed it.
>
> AIUI, the ATA spec allows for the reading of a long sector as well as
> the normal 512-byte sector. When you get a long sector you also get
> the CRC (or whatever checksum data there is on the disk that allows
> the drive itself to detect media errors).
>
> I don't have any idea how easy or hard it would be to do, but I would
> like to see the entire block subsystem enhanced to optionally allow
> long sector reads to be used in a "paranoid" fashion.
>
> Effectively it would be:
>
> 1) Read long sector from drive: verify CRC in kernel. This tests
> most everything on the i/o path.
>
> 2) Maintain CRC-type information in the block subsystem. Verify no
> corruption just before handing off to userspace. This would
> potentially identify CPU/cache/RAM failures.
Even if the drive supports those commands, the problem is that the CRC/ECC
data is in a vendor-specific format, so it couldn't be processed
generically.
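
As a rough illustration of step 2 of the quoted proposal only: because the
drive's own CRC/ECC bytes are vendor-specific, the sketch below assumes the
host computes its own generic CRC-32 over each 512-byte sector once the data
arrives, then rechecks it just before handing the data to userspace. This is
not kernel code and not something either poster wrote; the names tag_sector,
verify_sector and flip_bit are hypothetical. Note that a checksum computed on
the host cannot see corruption that happened before it was computed (cable,
controller), which is the part the long-sector idea was meant to cover.

```python
import zlib

SECTOR_SIZE = 512  # normal ATA sector size mentioned in the thread


def tag_sector(data):
    """Return (data, crc) for one sector as it enters the host's I/O path.

    Assumption: a generic CRC-32 computed by the host, NOT the drive's
    vendor-specific on-media ECC.
    """
    if len(data) != SECTOR_SIZE:
        raise ValueError("expected a %d-byte sector" % SECTOR_SIZE)
    return data, zlib.crc32(data)


def verify_sector(data, crc):
    """Recompute the CRC-32 and compare it with the stored value."""
    return zlib.crc32(data) == crc


def flip_bit(data, bit_index):
    """Simulate in-memory corruption by flipping a single bit."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)
```

With data, crc = tag_sector(sector), verify_sector(data, crc) holds for an
untouched buffer, while verify_sector(flip_bit(data, 100), crc) is false,
since CRC-32 detects every single-bit error.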