Message-ID: <20070208092945.GA10973@duck.suse.cz>
Date: Thu, 8 Feb 2007 10:29:45 +0100
From: Jan Kara <jack@...e.cz>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Andreas Dilger <adilger@...sterfs.com>, sho@...s.nec.co.jp,
linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [RFC][PATCH 2/3] Move the file data to the new blocks
On Wed 07-02-07 12:56:59, Andrew Morton wrote:
> On Wed, 7 Feb 2007 13:46:57 -0700
> Andreas Dilger <adilger@...sterfs.com> wrote:
>
> > On Feb 06, 2007 17:35 -0800, Andrew Morton wrote:
> > > On Mon, 5 Feb 2007 14:12:04 +0100
> > > Jan Kara <jack@...e.cz> wrote:
> > > > > Move the blocks on the temporary inode to the original inode
> > > > > by a page.
> > > > > 1. Read the file data from the old blocks to the page
> > > > > 2. Move the block on the temporary inode to the original inode
> > > > > 3. Write the file data on the page into the new blocks
> > > > I have one thing - it's probably not good to use page cache for
> > > > defragmentation.
> > >
> > > Then it is no longer online defragmentation. The issues with maintaining
> > > correctness and coherency with ongoing VFS activity would be truly ghastly.
> > >
> > > If we're worried about pagecache pollution then it would be better to control
> > > that from userspace via fadvise().
> >
> > It should be possible to have the online defrag tool lock the inode against
> > any changes,
>
> Sounds easy when you say it fast. But how do we "lock" against, say, a
> read pagefault? Only by writing back then removing the pagecache page then
> reinstantiating it as a locked, not-uptodate page and then removing it from
> pagecache afterwards prior to unlocking it. Or something.
>
> I don't think we want to go there.
I thought Andreas meant "any write changes", i.e. you check that no one
has the file open for writing and block any new open for writing.
That can be done quite easily.
Anyway, after thinking about it for a while, I agree with you that a
userspace solution to possible page cache pollution is preferable.
As I've been thinking about it, we could actually do the copying
from user space. We could do something like:
  1. Block any writes to the file (as I described above).
  2. Craft a new inode with blocks allocated as we want (using
     preallocation; we should mostly have the kernel infrastructure
     we need).
  3. Copy the data using the splice syscall.
  4. Call the kernel to switch the data.
But maybe I'm missing something and it's more complicated than I think.
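A rough sketch of the preallocate-and-copy part of the steps above, as it might look from user space. It assumes posix_fallocate() as a stand-in for the preallocation step (it reserves space but gives no control over block placement) and moves the data through a pipe with splice(). The function name copy_to_preallocated() is hypothetical; blocking writers and the final kernel call to switch the data are not shown, since this message does not define them.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: reserve len bytes in out_fd, then copy len bytes
 * from in_fd (starting at its current offset) to out_fd via splice().
 * Returns 0 on success, -1 on error with errno set.
 */
static int copy_to_preallocated(int in_fd, int out_fd, size_t len)
{
    int pipefd[2];
    int ret = -1, saved;

    /* Stand-in for "craft new inode with blocks allocated as we want". */
    int err = posix_fallocate(out_fd, 0, (off_t)len);
    if (err != 0) {
        errno = err;
        return -1;
    }

    /* splice() needs a pipe on one side of every transfer. */
    if (pipe(pipefd) < 0)
        return -1;

    while (len > 0) {
        /* Stay within the default 64 KiB pipe capacity. */
        size_t chunk = len < 65536 ? len : 65536;
        ssize_t n = splice(in_fd, NULL, pipefd[1], NULL, chunk, SPLICE_F_MOVE);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto out;
        }
        if (n == 0) {           /* source ended earlier than expected */
            errno = EIO;
            goto out;
        }
        len -= (size_t)n;

        /* Drain everything just queued in the pipe into the target. */
        while (n > 0) {
            ssize_t w = splice(pipefd[0], NULL, out_fd, NULL, (size_t)n,
                               SPLICE_F_MOVE);
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                goto out;
            }
            if (w == 0) {
                errno = EIO;
                goto out;
            }
            n -= w;
        }
    }
    ret = 0;
out:
    saved = errno;              /* close() may overwrite errno */
    close(pipefd[0]);
    close(pipefd[1]);
    errno = saved;
    return ret;
}
```

The caller would position in_fd at the start of the data and open out_fd without O_APPEND, since splice() does not write to append-mode descriptors.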
Honza
--
Jan Kara <jack@...e.cz>
SuSE CR Labs