Message-ID: <20080511164214.GA8091@skywalker>
Date: Sun, 11 May 2008 22:12:14 +0530
From: "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>
To: Christoph Hellwig <hch@...radead.org>
Cc: Matti Aarnio <matti.aarnio@...iler.org>,
Morten Welinder <mwelinder@...il.com>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: Deleting large files
On Sun, May 11, 2008 at 07:16:53AM -0400, Christoph Hellwig wrote:
> On Thu, May 08, 2008 at 11:19:06AM +0300, Matti Aarnio wrote:
> > This very question has troubled SQUID developers. Whatever the system, unlink()
> > that really does free diskspace does so with unbound timelimit and in services
> > where one millisecond is long wait time, the solution has been to run separate
> > subprocess that actually does the unlinks.
> >
> > Squid is not threaded software, and it was created long ago when threads were
> > rare and implementations were different in subtle details --> no threads at all.
>
> I'd call long times for the final unlink a bug in the filesystem.
> There's not all that much to do when deleting a file. What you need to
> do is basically return the allocated space to the free space allocator
> and mark the inode as unused and return it to the inode allocator. The
> first one may take quite a while with a indirect block scheme, but with
> an extent based filesystem it shouldn't be a problem. The latter
> shouldn't take too long either, and with a journaling filesystem it's
> even easier because you can intent-log the inode deletion first and then
> perform it later e.g. as part of a batched write-back of the inode
> cluster.
The problem with a journalling filesystem like ext3 is that the credits
available in the journal may not be sufficient for a full truncate. In
that case we have to commit the journal partway through the truncate.
And that means we have to zero-fill some of the indirect blocks so that
the inode format is a valid one when the transaction is committed.
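
To make the credit problem concrete, here is a toy model of a truncate
that runs out of credits. It is only a sketch and not ext3 or jbd code:
the names toy_truncate and truncate_stats are made up, and charging
exactly one credit per freed block is a simplifying assumption. It only
counts how many intermediate commits a truncate would be forced into.

```c
#include <stddef.h>

/* Toy model only: none of these names come from ext3 or jbd. */
struct truncate_stats {
    size_t blocks_freed;   /* blocks returned to the free space allocator */
    size_t commits;        /* intermediate journal commits forced on us */
};

/*
 * Free `nblocks` blocks when a single transaction only has
 * `credits_per_txn` credits. Assumption for illustration: freeing one
 * block costs exactly one credit. When the credits run out, the
 * current transaction is committed and a fresh one is started.
 */
static struct truncate_stats
toy_truncate(size_t nblocks, size_t credits_per_txn)
{
    struct truncate_stats st = { 0, 0 };
    size_t credits = credits_per_txn;

    if (credits_per_txn == 0)
        return st;   /* no progress is possible without credits */

    while (st.blocks_freed < nblocks) {
        if (credits == 0) {
            /*
             * Out of credits. In the real filesystem this is the
             * point where the on-disk inode has to be made valid
             * (e.g. indirect blocks zero-filled) before committing.
             */
            st.commits++;
            credits = credits_per_txn;
        }
        credits--;
        st.blocks_freed++;
    }
    return st;
}
```

In this model, toy_truncate(10, 4) frees all 10 blocks but needs two
intermediate commits, while toy_truncate(4, 4) fits in one transaction
and needs none. The larger the file relative to the available credits,
the more often the truncate has to stop and leave a valid inode behind.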
For ext3 there are patches from Abhishek that actually speed up
metadata-intensive operations. Eric Sandeen did some measurements
here:
http://people.redhat.com/esandeen/rm_test/
I have patches for Ext4 based on top of the new block allocator for
Ext4. There is some improvement in Ext3 mode.
http://www.radian.org/~kvaneesh/ext4/meta-group/
-aneesh