Message-ID: <3ae3aa420808061433i3d90c3dcgfb40d953da2941c8@mail.gmail.com>
Date: Wed, 6 Aug 2008 16:33:04 -0500
From: "Linas Vepstas" <linasvepstas@...il.com>
To: "Alan Cox" <alan@...rguk.ukuu.org.uk>,
"Martin K. Petersen" <martin.petersen@...cle.com>
Cc: "John Stoffel" <john@...ffel.org>,
"Alistair John Strachan" <alistair@...zero.co.uk>,
linux-kernel@...r.kernel.org
Subject: Re: amd64 sata_nv (massive) memory corruption
2008/8/5 Alan Cox <alan@...rguk.ukuu.org.uk>:
>> I'm game. Care to guide me through? So: on every write, this
>> new device mapper module computes a checksum and stores
>> it somewhere. On every read, it computes a checksum and
>> compares to the stored value. Easy enough I guess.
>>
>> Several hard parts:
>> -- where to store the checksums?
>
> That is the million dollar question - plus you can argue it is the fs
> that should do it. There is stuff crawling through the standards world to
> provide a small per block additional info area on disk sectors.
My objection to fs-layer checksums (e.g. in some user-space
file system) is that it doesn't leverage the extra info that RAID
has. If a block is bad, RAID can probably fetch another one
that is good. You can't do this at the file-system level.
I assume I can layer device-mappers anywhere, right?
Layering one *underneath* md-raid would allow it to
reject/discard bad blocks, and then let the raid layer
try to find a good block somewhere else.
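
To make the layering idea concrete, here is a rough user-space sketch in Python, not kernel code. The class names `ChecksumDevice` and `Mirror` are made up for illustration, and CRC32 is only a stand-in for whatever checksum would really be used. The checksum layer sits under the mirror, turns a mismatch into an I/O error, and the mirror then tries the next leg:

```python
import zlib


class ChecksumDevice:
    """Toy block device: stores a CRC32 per block and verifies it on read."""

    def __init__(self):
        self.blocks = {}  # block number -> data bytes
        self.sums = {}    # block number -> CRC32 recorded at write time

    def write(self, n, data):
        self.blocks[n] = bytes(data)
        self.sums[n] = zlib.crc32(data)

    def read(self, n):
        data = self.blocks[n]
        if zlib.crc32(data) != self.sums[n]:
            # Report corruption as an I/O error so the layer above can react.
            raise IOError(f"checksum mismatch on block {n}")
        return data


class Mirror:
    """Toy RAID-1: writes to every leg, reads from the first leg that succeeds."""

    def __init__(self, legs):
        self.legs = legs

    def write(self, n, data):
        for leg in self.legs:
            leg.write(n, data)

    def read(self, n):
        for leg in self.legs:
            try:
                return leg.read(n)
            except IOError:
                continue  # this copy is bad, try the next one
        raise IOError(f"no good copy of block {n}")
```

If one leg's copy of a block is silently corrupted, `Mirror.read` still returns the good data from the other leg; only when every copy fails its checksum does the error reach the caller.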
I assume that a device mapper can change the number
of blocks going in versus coming out; that the mapping
doesn't have to be 1-1. Then for every 10 sectors of data, it
would use 11 sectors of storage, one holding the
checksum. I'm very naive about how the block layer
works, so I don't know what snags there might be.
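
The 10-to-11 layout amounts to a bit of index arithmetic. A minimal sketch, assuming the checksum sector is placed at the end of each group of 10 data sectors (the placement and the function names are assumptions, chosen only for illustration):

```python
DATA_PER_GROUP = 10              # data sectors per group, from the example above
GROUP_SIZE = DATA_PER_GROUP + 1  # plus one checksum sector per group


def map_sector(logical):
    """Return (physical data sector, physical checksum sector) for a logical sector."""
    group, offset = divmod(logical, DATA_PER_GROUP)
    base = group * GROUP_SIZE
    return base + offset, base + DATA_PER_GROUP


def usable_sectors(physical_total):
    """Logical sectors exposed by a device of physical_total sectors (whole groups only)."""
    return (physical_total // GROUP_SIZE) * DATA_PER_GROUP
```

For example, `map_sector(10)` gives `(11, 21)`: the first sector of the second group lands just past the first group's checksum sector at physical sector 10, and its own checksum lives in sector 21. One consequence is that logically contiguous I/O that crosses a group boundary is no longer physically contiguous.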
The downside of this is that the disk wouldn't be
naively readable unless the specific mapper module
was in place -- so one would need a superblock of
some sort indicating the type of checksumming used,
etc. Is there any "standardized" way of managing
superblocks for use by the device mapper? I guess
the encrypting dm has to store meta-information
somewhere, too, specifying what kind of encryption
was used. I'll look at that.
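
As a sketch of what such a superblock might hold: the layout below is purely hypothetical (the magic string, field order, and field sizes are invented here, not taken from any existing device-mapper target). It records a format version, which checksum algorithm is in use, and how many data sectors share one checksum sector:

```python
import struct

SECTOR_SIZE = 512
MAGIC = b"DMCSUM01"    # hypothetical magic string
SB_FORMAT = "<8sHHI"   # magic, version, algorithm id, data sectors per group


def pack_superblock(version, algo, data_per_group):
    """Build a one-sector superblock, zero-padded to SECTOR_SIZE."""
    raw = struct.pack(SB_FORMAT, MAGIC, version, algo, data_per_group)
    return raw.ljust(SECTOR_SIZE, b"\0")


def parse_superblock(sector):
    """Decode a superblock sector, refusing anything without the magic."""
    magic, version, algo, data_per_group = struct.unpack_from(SB_FORMAT, sector)
    if magic != MAGIC:
        raise ValueError("not a checksummed volume")
    return {"version": version, "algo": algo, "data_per_group": data_per_group}
```

Checking the magic first is what keeps the module from treating an ordinary disk as a checksummed one, and the version field leaves room to change the layout later.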
> Yes. If you can figure out where to keep the checksums without ruining
> performance
Heh. Unlikely. The act of checksumming will impact
performance. It should end up similar to the impact
from encryption (maybe not quite as bad), or comparable
to raid-5 (which computes various kinds of parity).
> (and of course if there isn't one lurking in device mapper
> world not yet submitted).
I'm googling, but I don't see anything. However, I now see,
for the first time, pending work for 2.6.27 for a field in bio
called "blk_integrity". I cannot figure out if this work requires
special-whiz-bang disk drives to be purchased.
Also, it seems to be limited to 8 bytes of checksums per 512
byte block? This is reasonable for checksumming, I guess,
but one could get even fancier and run ECC-type sums, if
one could store, say, an additional 50 bytes for every 512
bytes. I'm cc'ing Martin Petersen, the developer, for
comments.
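
For scale, the space cost of these options is simple arithmetic. A small sketch, taking the 8-byte and 50-byte figures above at face value (the helper names are just for illustration):

```python
SECTOR = 512  # bytes per sector


def overhead(extra_bytes, sector=SECTOR):
    """Extra storage as a fraction of the data stored."""
    return extra_bytes / sector


def checksums_per_sector(checksum_bytes, sector=SECTOR):
    """How many checksums of the given size fit in one sector."""
    return sector // checksum_bytes
```

Here `overhead(8)` is 0.015625 (about 1.6%) and `overhead(50)` is 0.09765625 (about 9.8%), while `checksums_per_sector(8)` is 64. So if the sums are 8 bytes each, one checksum sector could in principle cover 64 data sectors rather than 10.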
--linas