Message-Id: <20070629143818.9f4ac7d7.akpm@linux-foundation.org>
Date: Fri, 29 Jun 2007 14:38:18 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Theodore Tso <tytso@....edu>
Cc: Andreas Dilger <adilger@...sterfs.com>,
Mike Waychison <mikew@...gle.com>,
Sreenivasa Busam <sreenivasac@...gle.com>,
"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: Re: fallocate support for bitmap-based files

On Fri, 29 Jun 2007 16:55:25 -0400
Theodore Tso <tytso@....edu> wrote:
> On Fri, Jun 29, 2007 at 01:01:20PM -0700, Andrew Morton wrote:
> >
> > Guys, Mike and Sreenivasa at google are looking into implementing
> > fallocate() on ext2. Of course, any such implementation could and should
> > also be portable to ext3 and ext4 bitmapped files.
>
> What's the eventual goal of this work? Would it be for mainline use,
> or just something that would be used internally at Google?
Mainline, preferably.
> I'm not
> particularly enthused about supporting two ways of doing fallocate();
> one for ext4 and one for bitmap-based files in ext2/3/4. Is the
> benefit really worth it?
umm, it's worth it if you don't want to wear the overhead of journalling,
and/or if you don't want to wait on the, err, rather slow progress of ext4.
> What I would suggest, which would make this much easier, is to make this
> an incompatible extension (which, as you point out, is needed for
> security reasons anyway) and then steal the high bit from the block
> number field to indicate whether or not the block has been
> initialized. That way you don't end up having to seek to a potentially
> distant part of the disk to check out the bitmap. Also, you don't
> have to worry about how to recover if the "block initialized bitmap"
> inode gets smashed.
>
> The downside is that it reduces the maximum size of the filesystem
> supported by ext2 by a factor of two. But, there are at least two
> patch series floating about that promise to allow filesystem block
> sizes > PAGE_SIZE, which would allow you to recover the maximum
> size supported by the filesystem.
>
> Furthermore, I suspect (especially after listening to a very fascinating
> Usenix Invited Talk by Jeffrey Dean, a fellow from Google, two weeks
> ago) that for many of Google's workloads, using a filesystem blocksize
> of 16K or 32K might not be a bad thing in any case.
>
> It would be a lot simpler....
>
Hadn't thought of that.
Also, it's unclear to me why google is going this way rather than using
(perhaps suitably-tweaked) ext2 reservations code.
Because the stock ext2 block allocator sucks big-time.
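
For illustration only, the "steal the high bit" idea quoted above could be
sketched roughly as follows. This is not code from any posted patch: the
names BLK_UNINIT_FLAG, BLK_NUMBER_MASK, blk_mark_uninit() and friends, and
max_fs_bytes(), are all hypothetical, and the sketch assumes 32-bit block
pointers held in host byte order (a real ext2 implementation would also have
to handle the little-endian on-disk format and return zeros when reading a
flagged block).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag: top bit of a 32-bit block pointer means
 * "allocated by fallocate() but not yet initialized". */
#define BLK_UNINIT_FLAG  UINT32_C(0x80000000)
#define BLK_NUMBER_MASK  UINT32_C(0x7fffffff)

/* Tag a physical block number as allocated-but-uninitialized. */
static inline uint32_t blk_mark_uninit(uint32_t blocknr)
{
	/* Only 31 bits remain for the block number itself. */
	assert((blocknr & BLK_UNINIT_FLAG) == 0);
	return blocknr | BLK_UNINIT_FLAG;
}

/* Clear the flag once the block has been written. */
static inline uint32_t blk_mark_init(uint32_t ptr)
{
	return ptr & BLK_NUMBER_MASK;
}

/* True if reads of this block must return zeros instead of disk contents. */
static inline bool blk_is_uninit(uint32_t ptr)
{
	return (ptr & BLK_UNINIT_FLAG) != 0;
}

/* Physical block number with the flag stripped. */
static inline uint32_t blk_number(uint32_t ptr)
{
	return ptr & BLK_NUMBER_MASK;
}

/* Largest filesystem, in bytes, addressable with nr_bits-bit block numbers. */
static inline uint64_t max_fs_bytes(unsigned nr_bits, uint32_t block_size)
{
	return ((uint64_t)1 << nr_bits) * block_size;
}
```

Under these assumptions, max_fs_bytes(32, 4096) is 16 TiB and
max_fs_bytes(31, 4096) is half that, which is the factor-of-two cost mentioned
above; max_fs_bytes(31, 8192) gets back to 16 TiB, which is why a larger
block size would recover the lost capacity.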