[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090112074846.GJ8071@disturbed>
Date: Mon, 12 Jan 2009 18:48:46 +1100
From: Dave Chinner <david@...morbit.com>
To: Jamie Lokier <jamie@...reable.org>
Cc: Arjan van de Ven <arjan@...ux.intel.com>,
Dave Kleikamp <shaggy@...ux.vnet.ibm.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Grissiom <chaos.proton@...il.com>, linux-kernel@...r.kernel.org,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH] async: Don't call async_synchronize_full_special()
while holding sb_lock
On Mon, Jan 12, 2009 at 02:31:38AM +0000, Jamie Lokier wrote:
> Arjan van de Ven wrote:
> > > - removing a million files and queuing all of the
> > > deletes in the async queues....
> >
> > the async code throttles at 32k outstanding.
> > Yes 32K is arbitrary, but if you delete a million files fast, all but the
> > first few thousand are
> > synchronous.
>
> Hmm.
>
> If I call unlink() a thousand times and then call fsync() on the
> parent directories covering files I've unlinked... I expect the
> deletes to be committed to disk when the last fsync() has returned. I
> require that a crash and restart will not see the files. Several
> kinds of transactional software and even some shell scripts expect this.
>
> Will these asynchronous deletes break the guaranteed
> commit-of-the-delete provided by fsync() on the parent directory?
It depends on the implementation of the filesystem you are running
on.
On XFS, it won't break anything because it uses a two-phase unlink.
The unlink() syscall removes the inode from the namespace
transactionally which means that even if you crash after the
directory fsync then they will never, ever appear in the directory
after recovery. In fact, recovery will see all those inodes as
existing on the unlinked lists and hence the execute the second
phase of the unlink during recovery.
I have no idea what the behaviour of other filesystems will
be but it needs to be evaluated on a filesystem-by-filesystem
basis.
Hmmmm. That introduces another interesting question w.r.t async
unlink in XFS - it's going to hammer the on-disk unlinked list hash
tables in XFS. The hashes are only 64 chains wide (per AG) so
pumping thousands of inodes into them is going to increase the
minimum memory footprint of XFS during rm -rf operations
substantially.
Cheers,
Dave.
--
Dave Chinner
david@...morbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists