Message-ID: <20110823023504.GT20655@tux1.beaverton.ibm.com>
Date: Mon, 22 Aug 2011 19:35:04 -0700
From: "Darrick J. Wong" <djwong@...ibm.com>
To: Andreas Dilger <adilger.kernel@...ger.ca>
Cc: "Theodore Ts'o" <tytso@....edu>,
linux-fsdevel Devel <linux-fsdevel@...r.kernel.org>,
linux-ext4 List <linux-ext4@...r.kernel.org>,
Sunil Mushran <sunil.mushran@...cle.com>,
Joel Becker <jlbec@...lplan.org>,
Mingming Cao <cmm@...ibm.com>,
Amir Goldstein <amir73il@...il.com>,
Coly Li <colyli@...il.com>, Andi Kleen <andi@...stfloor.org>
Subject: Re: [RFC] ext4 metadata checksumming design

On Mon, Aug 22, 2011 at 12:11:25PM -0600, Andreas Dilger wrote:
> On 2011-08-16, at 9:25 PM, Darrick J. Wong wrote:
> > I've created a page on the ext4 wiki outlining the patchset that I'm working on
> > to add metadata checksumming to ext4. The page can be found at this address:
> > https://ext4.wiki.kernel.org/index.php/Ext4_Metadata_Checksums
>
> Darrick,
> I just had a look through this document, and it looks pretty good. It does
> need to be updated to reflect that the inode checksum now covers the full
> inode size, which is already mentioned in the "Extended Attributes" section.
Updated; thank you.
> > For the most part, the metadata objects in ext4 actually have enough space to
> > squeeze in a 32-bit checksum; it was trivially easy to find a spot in the
> > superblock, the extent tree, extended attribute blocks, and the inode. Those
> > pieces are already done and in my tree, but the patchset as a whole is being
> > held up by the second class of metadata objects.
>
> For the group descriptor checksum and inode/block bitmap checksums with
> 32-byte group descriptors it makes sense to truncate the CRC32c checksum
> and store the low bits of the checksum in the existing 16-bit fields, and
> the high bits in extended 16-bit fields.
One thing I haven't had time to do yet is run the Monte Carlo simulation that
Ted suggested, to find out how much error detection is lost by cutting a crc32
in half. Do you know of anyone who has? (Or, for that matter, does anyone know
anything about my half-baked idea of computing crc16(crc32(bitmap))?)
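
For anyone who wants to try the experiment, here is a rough Python sketch of
the two 16-bit reductions discussed above, plus a tiny Monte Carlo harness.
It is only an illustration under stated assumptions: the function names
(split_csum, low16, crc16_of_crc32, miss_rate) are made up for this example,
the corruption model (1 to 4 random bit flips in a random bitmap) is an
arbitrary choice, and the crc16 here uses the 0xA001 reflected polynomial
with a zero seed, which may not match what the final patches use.

```python
import random

def _build_table(poly):
    # Byte-at-a-time lookup table for a reflected (LSB-first) CRC.
    table = []
    for i in range(256):
        c = i
        for _ in range(8):
            c = (c >> 1) ^ poly if c & 1 else c >> 1
        table.append(c)
    return table

_CRC32C_TABLE = _build_table(0x82F63B78)  # reflected Castagnoli polynomial
_CRC16_TABLE = _build_table(0xA001)       # reflected 0x8005 (assumed seed 0)

def crc32c(data):
    crc = 0xFFFFFFFF
    for b in data:
        crc = _CRC32C_TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF

def crc16(data, crc=0):
    for b in data:
        crc = _CRC16_TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc

def split_csum(csum32):
    # Low half for an existing 16-bit field, high half for an extended one.
    return csum32 & 0xFFFF, csum32 >> 16

def join_csum(lo, hi):
    return (hi << 16) | lo

def low16(bitmap):
    # Option 1: keep only the low 16 bits of the crc32c.
    return crc32c(bitmap) & 0xFFFF

def crc16_of_crc32(bitmap):
    # Option 2: the crc16(crc32(bitmap)) idea.
    return crc16(crc32c(bitmap).to_bytes(4, "little"))

def miss_rate(reduce, trials=1000, size=128, seed=0):
    # Fraction of corrupted bitmaps whose reduced checksum still matches.
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        good = bytearray(rng.getrandbits(8) for _ in range(size))
        bad = bytearray(good)
        # Distinct bit positions, so the corrupted copy always differs.
        for bit in rng.sample(range(size * 8), rng.randint(1, 4)):
            bad[bit // 8] ^= 1 << (bit % 8)
        if reduce(bytes(good)) == reduce(bytes(bad)):
            misses += 1
    return misses / trials
```

Comparing miss_rate(low16, ...) against miss_rate(crc16_of_crc32, ...) over
a large number of trials would give a first impression of the difference.
Since a 16-bit check is expected to miss roughly one corruption in 2^16, a
meaningful run needs far more trials than the default, and probably a faster
implementation than pure Python.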
> As a follow on, it probably also makes sense to test with a < 2^32 block
> filesystem with a 64-byte group descriptor. That would give enough room
> for 32-bit checksums even on smaller filesystems, and would also help
> facilitate resizing filesystems from < 2^32 blocks to > 2^32 blocks in
> the future. That _may_ just be as easy as formatting with "-O 64bit"
> on a < 2^32 block filesystem, but I don't know how much that has been
> tested.
I've been testing it. I haven't seen any problems _so_ far.... :)
Thank you for the review!
--D
>
> > The second class of objects is the one that required a bit of work:
> >
> > - Directory blocks have an "unused" 12-byte directory entry at the very end of
> > the block; 8 bytes of header are followed by a 32-bit checksum. This can be
> > taken care of as part of directory rebuilding in e2fsck/rehash.c.
> >
> > - HTree blocks had to have the dx_entry limit reduced by 1 to accommodate a
> > checksum. This is also taken care of during e2fsck directory rebuild.
> >
> > - Extended attribute blocks that are stored in the inode table -- the h_magic
> > field is written by the kernel, but neither the kernel nor e2fsprogs ever
> > actually read this field. The field could be reused to checksum the extra
> > space since (as far as I can tell) EAs are the only user of that empty space.
> >
> > Other miscellany:
> >
> > - e2fsprogs had to be converted to always work with ext2_inode_large.
> >
> > - Various bugs in the htree code....
> >
> > I hope to have a first draft of the kernel/e2fsprogs patches out on the mailing
> > list in a week or two, or at least before LPC next month. Still on my todo
> > list are superblocks, EAs, changing the jbd2 checksum, and rigorous testing on
> > powerpc.
> >
> > Please have a look at the design document and please feel free to suggest any
> > changes.
> >
> > --D
>
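
As an illustration of the directory-block and HTree items quoted above, here
is a rough Python sketch of a 12-byte tail entry (8 bytes of directory entry
header followed by a 32-bit checksum) and of the dx_entry limit being reduced
by one. This is a simplified model, not the patchset's code: the helper names
are invented, the checksum covers only the block contents before the tail,
the file_type byte is left as 0 as a placeholder, and the limit calculation
assumes an interior HTree node with an 8-byte fake dirent header and 8-byte
dx_entry records.

```python
import struct

def crc32c(data):
    # Bitwise CRC32c (reflected Castagnoli polynomial); slow but short.
    crc = 0xFFFFFFFF
    for b in data:
        crc ^= b
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF

# Little-endian: inode (u32), rec_len (u16), name_len (u8), file_type (u8),
# then the 32-bit checksum. inode == 0 marks the entry as unused.
TAIL_FMT = "<IHBBI"
TAIL_LEN = struct.calcsize(TAIL_FMT)  # 12 bytes

def add_dir_tail(block):
    # Overwrite the last 12 bytes of the block with a checksummed tail entry.
    body = bytes(block[:-TAIL_LEN])
    tail = struct.pack(TAIL_FMT, 0, TAIL_LEN, 0, 0, crc32c(body))
    return body + tail

def verify_dir_tail(block):
    body = block[:-TAIL_LEN]
    inode, rec_len, name_len, _ftype, csum = struct.unpack(
        TAIL_FMT, block[-TAIL_LEN:])
    return (inode == 0 and rec_len == TAIL_LEN and name_len == 0
            and csum == crc32c(body))

def dx_node_limit(blocksize, with_checksum):
    # Assumed layout: 8-byte fake dirent header, then 8-byte dx_entry slots.
    limit = (blocksize - 8) // 8
    # Give up one slot to make room for the checksum.
    return limit - 1 if with_checksum else limit
```

Because the tail looks like an unused directory entry, code that does not
know about the checksum should simply skip over it when walking the block.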