[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <47443446-99CA-43EE-89EC-7A5B6A9F8A0E@dilger.ca>
Date: Mon, 5 Sep 2011 19:54:32 -0600
From: Andreas Dilger <adilger@...ger.ca>
To: "djwong@...ibm.com" <djwong@...ibm.com>
Cc: Andreas Dilger <adilger@...ger.ca>,
Andreas Dilger <adilger.kernel@...ger.ca>,
Theodore Tso <tytso@....edu>,
Sunil Mushran <sunil.mushran@...cle.com>,
Amir Goldstein <amir73il@...il.com>,
Andi Kleen <andi@...stfloor.org>,
Mingming Cao <cmm@...ibm.com>,
Joel Becker <jlbec@...lplan.org>,
"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>,
Coly Li <colyli@...il.com>
Subject: Re: [PATCH 10/37] mke2fs: Allow metadata checksums to be turned on at mkfs time
On 2011-09-05, at 1:20 PM, "Darrick J. Wong" <djwong@...ibm.com> wrote:
> On Sun, Sep 04, 2011 at 12:28:24PM -0600, Andreas Dilger wrote:
>> On 2011-08-31, at 6:36 PM, "Darrick J. Wong" <djwong@...ibm.com> wrote:
>>> Write out checksummed inodes even when writing out a zeroed table.
>>>
>>> + if (fs->super->s_creator_os == EXT2_OS_LINUX &&
>>> + fs->super->s_feature_ro_compat &
>>> + EXT4_FEATURE_RO_COMPAT_METADATA_CSUM) {
>>
>> Somehow it doesn't look like this is skipping the zeroing of the inode table
>> blocks if lazy itable zeroing is set.
>>
>> Any measurements on how much this slows down inode table writing (which is
>> already the slowest part of mke2fs)?
>
> Quite a lot, actually. Trouble is, if you're going to write zeroes to the
> inode table (without using uninit) then I think you need the checksums to
> match. Maybe the solution is to modify the kernel/e2fsck to ignore the
> checksum if the inode bitmap says the inode isn't in use?
The kernel is already aware of which inodes are not in use if the uninit_bg feature is enabled. Even without uninit_bg, the kernel will not read itable blocks from disk if none of the inodes in that block are used.
Also, if the lazy_itable_init is passed to mke2fs it isn't supposed to initialize the inode table at all, and the kernel should do it instead.
> A better solution is to zero the buffer, stuff in all the checksums in the
> correct places, and then write the block out.
Rather, the kernel should do it in the background.
>>> + bzero(&inode, sizeof(inode));
>>> + for (ino = fs->super->s_inodes_per_group * i;
>>> + ino < fs->super->s_inodes_per_group * (i + 1);
>>> + ino++) {
>>
>> Why recompute "ino" each time through this loop? It should be enough to
>> simply initialize it at 1 and then increment it for each inode written.
>
> Agreed.
>
> --D
>>> + if (!ino)
>>> + continue;
>>> + retval = ext2fs_write_inode(fs, ino, &inode);
>>> + if (retval) {
>>> + com_err("inode_init", retval,
>>> + "while writing inode %d\n",
>>> + ino);
>>> + exit(1);
>>> + }
>>> + }
>>> + } else {
>>> + retval = ext2fs_zero_blocks2(fs, blk, num, &blk, &num);
>>> + if (retval) {
>>> + fprintf(stderr, _("\nCould not write %d "
>>> + "blocks in inode table starting "
>>> + "at %llu: %s\n"),
>>> + num, blk, error_message(retval));
>>> + exit(1);
>>> + }
>>> }
>>> if (sync_kludge) {
>>> if (sync_kludge == 1)
>>> @@ -829,7 +851,8 @@ static __u32 ok_features[3] = {
>>> EXT4_FEATURE_RO_COMPAT_EXTRA_ISIZE|
>>> EXT2_FEATURE_RO_COMPAT_SPARSE_SUPER|
>>> EXT4_FEATURE_RO_COMPAT_GDT_CSUM|
>>> - EXT4_FEATURE_RO_COMPAT_BIGALLOC
>>> + EXT4_FEATURE_RO_COMPAT_BIGALLOC|
>>> + EXT4_FEATURE_RO_COMPAT_METADATA_CSUM
>>> };
>>>
>>>
>>>
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists