[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <46A8895A.5000308@linux.vnet.ibm.com>
Date: Thu, 26 Jul 2007 17:15:30 +0530
From: "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>
To: Theodore Tso <tytso@....edu>
CC: linux-ext4@...r.kernel.org
Subject: Re: [PATCH 3/3] e2fsprogs: Support for large inode migration.
Theodore Tso wrote:
> On Wed, Jul 25, 2007 at 11:06:28AM +0530, Aneesh Kumar K.V wrote:
>> From: Aneesh Kumar K.V <aneesh.kumar@...ux.vnet.ibm.com>
>>
>> Add new option -I <inode_size> to tune2fs.
>> This is used to change the inode size. The size
>> need to be multiple of 2 and we don't allow to
>> decrease the inode size.
>>
>> As a part of increasing the inode size we throw
>> away the free inodes in the last block group. If
>> we can't we fail. In such case one can resize the
>> file system and then try to increase the inode size.
>
> Let me guess, you're testing with a filesystem with two block groups,
> right? And to date you've tested *only* by doubling the size of the
> inode.
>
I tested this with multiple ( 1 and 7 ) groups. But yes all the testing was to change
inode size from 128 to 256.
> What your patch does is is keep the number of inode blocks per block
> group constant, so that the total number of inodes decreases by
> whatever factor the inode size is increasing. It's a cheap, dirty way
> of doing the resizing, since it avoids needing to either (a) update
> directory entries when inode numbers get renumbered, and (b) need to
> update inodes when blocks need to get relocated in order to make room
> for growing the inode table.
>
That is correct. What i was looking at was to get the dynamic inode location
first. That should help us to place large inode any where right ?. But i know
that is a long term goal since there is no patches for dynamic inode location.
I will work at increasing the inode table size as a part of increasing the inode
size.
> The problem with your patch is:
>
> * By shrinking the number of inodes, it can constrain the
> ability of the filesystem to create new files in the future.
>
I explained this in the commit log.
> * It ruins the inode and block placement algorithms where we
> try to keep inodes in the same block group as their parent
> directory, and we try to allocate blocks in the same block
> group as their containing inode.
I missed this in my analysis. So this means we may end up with bad performance
after resizing the inode. I will look at increasing the inode table size as a
part of increasing the inode size.
>
> * Because when the current patch makes no attempt to relocate
> inodes, and when it doubles the inode size, it chops the
> number of inodes in half, there must be no inodes in the
> last half of the inode table. That is if there are N block
> groups, the inode tables in blockgroups N/2 to N-1 must be
> empty. But because of the block group spreading algorithm,
> where new directories get pushed out to new block groups, in
> any real real-life filesystem, the use of block groups is
> evenly spread out, which means in practice you won't see
> case where the last half of the inodes will not be in use.
> Hence, your patch won't actually work in practice.
>
> So unfortunately, the right answer *will* require expanding the inode
> tables, and potentially moving blocks out of the way in order to make
> room for it. A lot of that machinery is in resize2fs, actually, and
> I'm wondering if the right answer is to move resize2fs's functionality
> into tune2fs. We will also need this to be able to add the resize
> inode after the fact.
>
> That's not going to be a trivial set of changes; if you're looking for
> something to test the undo manager, my suggestion would be to wire it
> up into mke2fs and/or e2fsck first. Mke2fs might be nice since it
> will give us a recovery path in case someone screws up the arguments
> to mkfs.
>
I guess Undo I/O manager can go in because I have been using it for
the ext3 -> ext4 inode migration testing and for testing the above patch.
Why would one need to recover on mkfs. He can very well run mkfs again right ?
>> tune2fs use undo I/O manager when migrating to large
>> inode. This helps in reverting the changes if end results
>> are not correct.The environment variable TUNE2FS_SCRATCH_DIR
>> is used to indicate the directory within which the tdb
>> file need to be created. The file will be named tune2fs-XXXXXX
>
> My suggestion would be to use something like /var/lib/e2fsprogs as the
> defalut directory. And we should also do some tests to make sure
> something sane happens if we run out of room for the undo file.
> Presumably the only thing we can do is to abort the run and then back
> out the chnages using what was written out to the undo file.
>
I had a FIXME!! in the code which stated it would be nice to use the conf file
But right now the conffile is e2fsck specific
+ char *tdb_dir, tdb_file[PATH_MAX];
+#if 0 /* FIXME!! */
+ /*
+ * Configuration via a conf file would be
+ * nice
+ */
+ profile_get_string(profile, "scratch_files",
+ "directory", 0, 0,
+ &tdb_dir);
+#endif
+ tdb_dir = getenv("TUNE2FS_SCRATCH_DIR");
-aneesh
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists