Message-ID: <20090724002317.GA14052@mit.edu>
Date: Thu, 23 Jul 2009 20:23:17 -0400
From: Theodore Tso <tytso@....edu>
To: Eric Sandeen <sandeen@...hat.com>
Cc: Andreas Dilger <adilger@....com>, linux-ext4@...r.kernel.org
Subject: Re: How to fix up mballoc
On Thu, Jul 23, 2009 at 12:43:47PM -0500, Eric Sandeen wrote:
> > 1) In ext4_mb_normalize_request(), if the inode that we are allocating
> > does not have any open file descriptors for write (i.e., it's already
> > closed and we're allocating via delalloc) _and_ the inode was
> > previously opened with O_CREAT and without O_APPEND (checked via a
> > flag in EXT4_I(inode)), then do not normalize the size to a power of
> > two, but rather to the filesystem blocksize.
>
> I'm sort of woefully ignorant of a lot of the mballoc stuff.
>
> When you say once a file is written that's probably the final size... do
> you mean when writes are done and it's closed, or when the first write
> to the file is complete?
>
> I think an awful lot of normal cases write to a file in sub-file-sized
> chunks (think mp3 or flac encoding, file downloading, etc).
I meant when the writes are done and the files are closed; hence my
proposal that we do #1 above only if there are no open file
descriptors for write. That is, if the file can be written and closed
by the userspace process before the filesystem attempts to write any
of its delayed allocation blocks, we can probably safely assume that
the file won't grow in size later on.
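A rough userspace sketch of the decision in #1, to make the proposal
concrete. Everything here is hypothetical: struct inode_hint and its
fields stand in for whatever state would live in EXT4_I(inode), and
the power-of-two rounding is a simplification of what
ext4_mb_normalize_request() does today.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-inode hints; not the real ext4_inode_info. */
struct inode_hint {
    int  open_writers;       /* fds currently open for write */
    bool created_no_append;  /* last open: O_CREAT and no O_APPEND */
};

/* Round size up to a multiple of blocksize (blocksize is a power of two). */
static uint64_t round_up_to_block(uint64_t size, uint32_t blocksize)
{
    uint64_t mask = (uint64_t)blocksize - 1;
    return (size + mask) & ~mask;
}

/* Round size up to the next power of two (sizes above 2^63 are returned as-is). */
static uint64_t round_up_pow2(uint64_t size)
{
    uint64_t p = 1;

    if (size > (UINT64_C(1) << 63))
        return size;
    while (p < size)
        p <<= 1;
    return p;
}

/*
 * Proposal #1: if the file is already closed and looks like it was
 * written once, assume it has reached its final size and only round
 * up to the block size. Otherwise keep the power-of-two behaviour.
 */
static uint64_t normalize_request(const struct inode_hint *h,
                                  uint64_t size, uint32_t blocksize)
{
    if (h->open_writers == 0 && h->created_no_append)
        return round_up_to_block(size, blocksize);
    return round_up_pow2(size);
}
```

For a 70000-byte file on a 4k-block filesystem, this asks for 73728
bytes once the file is closed, versus 131072 bytes while a writer
still has it open.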
> Also, I get the !O_APPEND test, but is O_CREAT necessary? I wonder how
> much of a hint that really gives us.
Well, it probably should be O_CREAT || O_TRUNC. The basic idea here is
to distinguish a freshly written file from one which gets appended to
via syslog, or via a mail delivery program that writes 4k of data to
the end of a mail spool file. In some cases, such as the mail delivery
program, it might not use O_APPEND, but instead it might lock the file,
seek to the end of the file, and then write the 4k worth of e-mail. So
if the file wasn't freshly created (or truncated) at the last open,
maybe we
should use a more aggressive preallocation --- and in the case of
/var/mail spool delivery, perhaps the preallocation should persist
beyond the file getting closed. (In the future we might want to have
some heuristics where if we notice that the pattern of file writes is
a repeated open, write-causing-block-allocation, close, maybe we
should do some kind of block reservation style scheme while the
filesystem is mounted and the inode stays in the inode cache.)
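
As a minimal sketch of the open-time test, assuming the flags from the
last open are available: the function name is made up, and a real
implementation would want to know whether the open actually created
the file, since O_CREAT on an existing file does not truncate it.

```c
#include <fcntl.h>
#include <stdbool.h>

/*
 * Hypothetical helper: does the last open suggest a file that is
 * written once and then left alone?
 *
 *   - O_APPEND means something like syslog: expect more appends.
 *   - Neither O_CREAT nor O_TRUNC means an existing file was reopened,
 *     e.g. mail delivery that locks, seeks to the end, and writes.
 *   - O_CREAT or O_TRUNC without O_APPEND means the contents are
 *     (probably) being written from scratch.
 */
static bool open_flags_suggest_final_size(int flags)
{
    if (flags & O_APPEND)
        return false;
    return (flags & (O_CREAT | O_TRUNC)) != 0;
}
```

The lock-and-seek mail delivery case opens with plain O_RDWR or
O_WRONLY, so it falls into the "keep preallocating aggressively" side
without needing O_APPEND to be set.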
- Ted