linux-ext4 - Re: ext2/3 create large filesystem takes too much time; solutions

Message-ID: <20060915212034.GB11237@thunk.org>
Date:	Fri, 15 Sep 2006 17:20:34 -0400
From:	Theodore Tso <tytso@....edu>
To:	Pavel Mironchik <tibor0@...il.com>
Cc:	linux-ext4@...r.kernel.org
Subject: Re: ext2/3 create large filesystem takes too much time; solutions

On Tue, Sep 12, 2006 at 02:07:34PM +0300, Pavel Mironchik wrote:
> 
> Ext2/3 erases the inode tables when creating a new filesystem.
> This is a very long operation when the target volume is larger than
> 2 TB. Other filesystems are not affected by such a huge delay at
> creation time. My concern is to improve the design of ext3 to decrease
> the time consumed by creating large ext3 volumes on storage servers.
> In general, to solve the problem, we should defer the job of cleaning
> inodes to the kernel. In e2fsprogs there is a LAZY_BG option, but it
> only avoids erasing the inodes.

Hi Pavel,

	Apologies that no one responded right away; I think a lot of
people have been incredibly busy.  I've been doing a huge amount of
travel myself personally, and so my e-mail latency has been larger
than normal.

	The problem of long mke2fs run times is one that we've
considered, and we do want to do something about it, but it's not been
as high a priority as some of the other problems on our hit list.
Certainly, given that inode space is very precious, I'm not convinced
that breaking backwards compatibility and burning an extra 16 bytes
per inode is worth the net gain --- although there are other solutions
that don't have that particular cost.  Yes, they take more lines of
code to support, but given the hopefully large number of people that
will be using ext4, I'd much rather spend an extra amount of
development time getting it right, than doing something fast and dirty
which then everyone pays for, over and over, again and again and again
across millions and millions of machines!


> I see several solutions for that problem:
> 1) Add special bitmaps into the fs header (inode group descriptors?).
> By looking at those bitmaps the kernel could determine whether an inode
> has not been cleaned, and then properly initialize that inode.

Actually, you don't need a bitmap; a much simpler solution is to have
an integer field in the block group descriptors which indicates the
number of inodes that have been initialized in that block group.  The
problem, though, is what happens if the block group descriptors (or the
bitmaps) get corrupted.  So what we also want to do is to add support
for checksums in the individual inodes and in the block group
descriptors themselves, as a double-check.
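
To make the idea concrete, here is a rough sketch in C of a per-group
"inodes initialized" counter guarded by a checksum.  Everything in it is
an assumption made for illustration: the struct demo_group_desc layout,
the demo_* function names, and the use of Fletcher-16 as a stand-in
checksum are all hypothetical, and none of it is the actual ext2/3
on-disk format or a proposed ext4 layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified descriptor; NOT the real ext2/3 on-disk layout. */
struct demo_group_desc {
    uint32_t inodes_per_group;   /* total inode slots in this group */
    uint32_t inodes_initialized; /* slots [0, n) were written by mke2fs or the kernel */
    uint16_t checksum;           /* guards the two fields above */
};

enum demo_inode_state {
    DEMO_INODE_ON_DISK,    /* slot was initialized; safe to read from disk */
    DEMO_INODE_NEEDS_INIT, /* slot was never written; treat as all zeroes */
    DEMO_DESC_CORRUPT      /* descriptor cannot be trusted */
};

/* Fletcher-16, used here only as a stand-in checksum (an assumption). */
uint16_t demo_fletcher16(const uint8_t *data, size_t len)
{
    uint16_t a = 0, b = 0;
    for (size_t i = 0; i < len; i++) {
        a = (uint16_t)((a + data[i]) % 255);
        b = (uint16_t)((b + a) % 255);
    }
    return (uint16_t)((b << 8) | a);
}

/* Serialize a 32-bit value as little-endian so the checksum does not
 * depend on host byte order or struct padding. */
void demo_put_le32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)((v >> 8) & 0xff);
    out[2] = (uint8_t)((v >> 16) & 0xff);
    out[3] = (uint8_t)((v >> 24) & 0xff);
}

uint16_t demo_desc_checksum(const struct demo_group_desc *gd)
{
    uint8_t buf[8];
    demo_put_le32(buf, gd->inodes_per_group);
    demo_put_le32(buf + 4, gd->inodes_initialized);
    return demo_fletcher16(buf, sizeof(buf));
}

/* Recompute and store the checksum after changing the descriptor. */
void demo_desc_seal(struct demo_group_desc *gd)
{
    gd->checksum = demo_desc_checksum(gd);
}

/* Decide how to treat inode slot `index` within its group. */
enum demo_inode_state demo_classify_inode(const struct demo_group_desc *gd,
                                          uint32_t index)
{
    /* Verify the descriptor before trusting any of its fields. */
    if (gd->checksum != demo_desc_checksum(gd) ||
        gd->inodes_initialized > gd->inodes_per_group)
        return DEMO_DESC_CORRUPT;

    assert(index < gd->inodes_per_group);
    return index < gd->inodes_initialized ? DEMO_INODE_ON_DISK
                                          : DEMO_INODE_NEEDS_INIT;
}

/* Record that slots up to and including `index` are now initialized. */
void demo_mark_initialized(struct demo_group_desc *gd, uint32_t index)
{
    assert(index < gd->inodes_per_group);
    if (index >= gd->inodes_initialized) {
        /* A real implementation would zero slots
         * [inodes_initialized, index] on disk before this update. */
        gd->inodes_initialized = index + 1;
        demo_desc_seal(gd);
    }
}
```

In this sketch mke2fs would only have to write the descriptor with
inodes_initialized set to 0 instead of zeroing the whole inode table,
and a checksum mismatch tells the reader not to trust the counter at
all.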

These are useful features in and of themselves, and there are some
sample implementations of them (for example, in the Iron ext2 paper).
So my thinking is that we should get that work into ext4, and then
it's not hard to add the support for fields in the block group
descriptors that would allow for fast mke2fs support.

Regards,

						- Ted