Message-ID: <20060918195114.GA14342@schatzie.adilger.int>
Date: Mon, 18 Sep 2006 13:51:14 -0600
From: Andreas Dilger <adilger@...sterfs.com>
To: Theodore Tso <tytso@....edu>
Cc: Pavel Mironchik <tibor0@...il.com>, linux-ext4@...r.kernel.org
Subject: Re: ext2/3 create large filesystem takes too much time; solutions
On Sep 17, 2006 01:57 -0600, Andreas Dilger wrote:
> Things that need to be done:
> - the kernel block/inode allocation needs to be reworked:
> - initialize a whole block worth of inodes at one time instead
> of single inodes.
> - I don't think we need to zero out the unused inodes - the kernel
> should already be doing this if the inode block is unused
> - find a happy medium between using existing groups (inodes/blocks)
> and initializing new ones
> - we likely need to verify the checksum in more places in e2fsck before
> trusting the UNINIT flags
- need to decide what to do if the UNINIT flag is set but the checksum is wrong.
this risks picking up a LOT of garbage from the disk, including
old "valid" inodes, garbage for bitmaps, etc.
- should kernel and/or e2fsck zero the unused parts of the inode table
asynchronously to avoid such problems? It could optionally write out
only the blocks that are not already zero (to avoid consuming space
on sparse filesystems), but this would require an additional read of each
block (maybe done slowly to avoid overloading the system). Could also
have another flag which indicates whether the group data is already zeroed.
- need to clear UNINIT flags if we detect a bitmap/inode is in use in group;
this would possibly also force a restart of e2fsck so that it checks the
whole group (with caveat for above).
- need to zero itable blocks if allocating from an UNINIT group in e2fsprogs
- need to zero ibitmap/bbitmap if using UNINIT group in e2fsprogs
- should we drop bg_itable_unused to minimum possible value on e2fsck?
this would reduce subsequent e2fsck time a bit.
- need to properly handle big-endian machines in e2fsprogs when computing
the checksum. The kernel will always do the crc on little-endian disk data,
and little-endian e2fsprogs will do the same.
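
To illustrate the endianness item above, here is a minimal sketch (not
taken from the attached patch) of one way to get the same checksum on any
host: serialize each field into little-endian byte order first, then run
the crc over those bytes. Everything named demo_* is hypothetical: the
struct is a simplified stand-in and not the real on-disk descriptor
layout, the bitwise crc16 stands in for whatever crc routine is actually
used, and the 0xFFFF seed and the inclusion of the group number are
assumptions made only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified descriptor held in host byte order.
 * This is NOT the real ext2/3 group descriptor layout. */
struct demo_group_desc {
    uint32_t block_bitmap;
    uint32_t inode_bitmap;
    uint32_t inode_table;
    uint16_t free_blocks_count;
    uint16_t free_inodes_count;
    uint16_t flags;
    uint16_t itable_unused;
};

/* Plain bitwise CRC-16 (reflected polynomial 0xA001); a stand-in for
 * the real crc routine, chosen only so the sketch is self-contained. */
static uint16_t demo_crc16(uint16_t crc, const uint8_t *buf, size_t len)
{
    while (len--) {
        crc ^= *buf++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ 0xA001)
                            : (uint16_t)(crc >> 1);
    }
    return crc;
}

/* Store values least-significant byte first, whatever the host order. */
static size_t demo_put_le16(uint8_t *out, uint16_t v)
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)(v >> 8);
    return 2;
}

static size_t demo_put_le32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)((v >> 8) & 0xff);
    out[2] = (uint8_t)((v >> 16) & 0xff);
    out[3] = (uint8_t)(v >> 24);
    return 4;
}

/* Checksum over the little-endian image of the descriptor, so big- and
 * little-endian hosts compute the same value for the same contents.
 * Assumption: group number is mixed in and the seed is 0xFFFF. */
static uint16_t demo_group_desc_csum(const struct demo_group_desc *gd,
                                     uint32_t group)
{
    uint8_t buf[24];    /* 4 (group) + 3 * 4 + 4 * 2 bytes */
    size_t n = 0;

    n += demo_put_le32(buf + n, group);
    n += demo_put_le32(buf + n, gd->block_bitmap);
    n += demo_put_le32(buf + n, gd->inode_bitmap);
    n += demo_put_le32(buf + n, gd->inode_table);
    n += demo_put_le16(buf + n, gd->free_blocks_count);
    n += demo_put_le16(buf + n, gd->free_inodes_count);
    n += demo_put_le16(buf + n, gd->flags);
    n += demo_put_le16(buf + n, gd->itable_unused);

    return demo_crc16(0xFFFF, buf, n);
}
```

Serializing by shifts rather than casting the struct to a byte pointer
avoids depending on host byte order or struct padding; the other option
would be to byte-swap the descriptor in place before the crc and swap it
back afterwards.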
Attached is a slightly improved version. It at least passes "make check"
in tests, though I haven't gotten the "tst_csum" program to build & run
automatically (it passes by hand).
Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
View attachment "e2fsprogs-uninit.patch" of type "text/plain" (34314 bytes)