[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20070530090219.GA11957@duck.suse.cz>
Date: Wed, 30 May 2007 11:02:19 +0200
From: Jan Kara <jack@...e.cz>
To: Mingming Cao <cmm@...ibm.com>
Cc: Theodore Tso <tytso@....edu>, linux-ext4@...r.kernel.org
Subject: Re: ext2_discard_prealloc() called on each iput?
On Tue 29-05-07 14:10:52, Mingming Cao wrote:
> On Mon, 2007-05-28 at 18:04 +0200, Jan Kara wrote:
> > On Wed 23-05-07 08:37:43, Theodore Tso wrote:
> > > On Tue, May 22, 2007 at 06:11:27PM +0200, Jan Kara wrote:
> > > >
> > > > while fixing some problems with preallocation in UDF, I had a look how
> > > > ext2 solves similar problems. I found out that ext2_discard_prealloc() is
> > > > called on every iput() from ext2_put_inode(). Is it really appropriate? I
> > > > don't see a reason for doing so...
> > >
> > > I agree, it's probably not appropriate. It's been that way for a long
> > > time, though (since 2.4.20). It's not as horrible as it seems since
> > > unlike traditional Unix systems, we don't call iput() as often, since
> > > for example operations like close() end up calling dput(), which
> > > decrements the ref. count on dentry, not the inode. But it would
> > > probably be better to check to see if i_count is 1 before deciding to
> > > discard the preallocation.
> > OK, but then you could move the code to drop_inode() which is called at
> > exactly that moment... I've been thinking more about it when fixing UDF.
>
> I have tried to optimize ext2 discard preallocation code like ext3
> discard reservation a while back: we only call ext2_discard_prealloc on
> the last iput(), i.e. ext2/3_clear_inode().
>
> This patch actually made into mainline for a little while, then later it
> is being reversed back because of possible leak of preallocated blocks.
>
> Tt the unmount time, someone might still hold the reference of the
> inode, thus ext2_discard_prealloc() did not get a chance to be called.
> Since ext2 preallocation is doing pre-allocation on disk, this leads to
> leak of preallocated blocks, confused fsck later.
>
> http://lkml.org/lkml/2005/4/12/333
Interesting. I don't quite understand how it can happen that inode is not
released at umount time... It must be released for umount to succeed. There
is a slight problem that inode is not written after calling clear_inode()
which could cause some problems (i_blocks being wrong) but blocks in
bitmaps should be freed just right...
> > > anyway, so the preallocated region is less useful.
> > OK, but still we could use e.g. i_writecount to check that we drop the
> > last descriptor for writing...
> >
> Yep, that is what ext3 does in ext3_release_file(), I forget why we
> didn't fix this for ext2.
Hmm, probably we just forgot...
Honza
--
Jan Kara <jack@...e.cz>
SuSE CR Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists