Message-ID: <493F7732.1020505@rs.jp.nec.com>
Date: Wed, 10 Dec 2008 17:00:50 +0900
From: Akira Fujita <a-fujita@...jp.nec.com>
To: Theodore Tso <tytso@....edu>
CC: linux-ext4@...r.kernel.org
Subject: Re: [PATCH]ext4: online defrag: Enable to reuse blocks by multiple defrag
Hi Ted,
Thank you for letting me know.
I think the new defrag can be implemented with your proposal.
First, I plan to implement the ordinary defrag (without any options)
in the following steps.
Please check whether this approach is fine.
(U: User space, K: Kernel)
1:U Create the donor inode and then unlink it.
2:U Allocate contiguous blocks to the donor inode with fallocate().
3:U Call the FS_IOC_FIEMAP ioctl to get the extent information of the
donor inode, and check that the donor inode has fewer extents than the
defrag target inode.
4:U Call the EXT4_IOC_DEFRAG ioctl to exchange the data between the
target inode and the donor inode.
5:K The EXT4_IOC_DEFRAG ioctl calls ext4_defrag() in the kernel
(I'm going to change the current ext4_defrag() to do only the data exchange).
* Steps 4 and 5 correspond to Ted's ioctl (3).
6:U Close the fd of the donor inode.
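
The user-space side of steps 1 to 3 could be sketched roughly as below. This is only an illustration under assumptions: the helper names create_donor(), count_extents() and donor_has_fewer_extents() and the ".donor-XXXXXX" temporary name are hypothetical and not part of any existing tool. It relies on FS_IOC_FIEMAP reporting the extent count in fm_mapped_extents when fm_extent_count is 0.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, FIEMAP_* */

/* Steps 1-2 (hypothetical helper): create an unlinked donor file in `dir`
 * and preallocate `size` bytes for it. Returns an open fd, or -1 on error. */
int create_donor(const char *dir, off_t size)
{
    char path[PATH_MAX];
    int n, fd;

    assert(dir != NULL);
    n = snprintf(path, sizeof(path), "%s/.donor-XXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    fd = mkstemp(path);              /* step 1: create the donor ... */
    if (fd < 0)
        return -1;
    if (unlink(path) != 0) {         /* ... and unlink it right away */
        close(fd);
        return -1;
    }
    if (fallocate(fd, 0, 0, size) != 0) {  /* step 2: preallocate blocks */
        close(fd);
        return -1;
    }
    return fd;
}

/* Step 3 (hypothetical helper): number of extents mapped by `fd`, or -1.
 * With fm_extent_count == 0 the kernel only fills in fm_mapped_extents. */
long count_extents(int fd)
{
    struct fiemap fm;

    memset(&fm, 0, sizeof(fm));
    fm.fm_start = 0;
    fm.fm_length = FIEMAP_MAX_OFFSET;
    fm.fm_flags = FIEMAP_FLAG_SYNC;  /* flush dirty data before mapping */
    fm.fm_extent_count = 0;
    if (ioctl(fd, FS_IOC_FIEMAP, &fm) != 0)
        return -1;
    return (long)fm.fm_mapped_extents;
}

/* Step 3 check: 1 if the donor has fewer extents than the target,
 * 0 if it does not, -1 if either extent count could not be read. */
int donor_has_fewer_extents(int target_fd, int donor_fd)
{
    long target = count_extents(target_fd);
    long donor = count_extents(donor_fd);

    if (target < 0 || donor < 0)
        return -1;
    return donor < target;
}
```

A caller would create the donor on the same filesystem as the target, sized to match it, and go on to step 4 only when donor_has_fewer_extents() returns 1. Closing the donor fd afterwards (step 6) releases its blocks, since the file is already unlinked.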
The new EXT4_IOC_DEFRAG would be implemented as follows.
#define EXT4_IOC_DEFRAG _IOW('f', 15, struct move_extent)
struct move_extent
{
    int org_fd;       /* file descriptor of defrag target file */
    int dest_fd;      /* file descriptor of donor file */
    long long start;  /* logical block offset of target file */
    long long len;    /* exchange data length in blocks */
};
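
For illustration, step 4 with this proposed interface might look like the sketch below. The struct and request number are copied from the proposal above and are not an existing kernel ABI, so on a kernel that does not implement them the ioctl is expected to fail. The helper name exchange_blocks() and the choice of issuing the ioctl on the target's fd are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>  /* _IOW */

/* Proposed layout from this mail; an assumption, not a merged interface. */
struct move_extent {
    int org_fd;       /* file descriptor of defrag target file */
    int dest_fd;      /* file descriptor of donor file */
    long long start;  /* logical block offset of target file */
    long long len;    /* exchange data length in blocks */
};
#define EXT4_IOC_DEFRAG _IOW('f', 15, struct move_extent)

/* Hypothetical helper for step 4: ask the kernel to exchange `len` blocks,
 * starting at logical block `start`, between the target and the donor.
 * Returns 0 on success or a negative errno value on failure. */
int exchange_blocks(int org_fd, int donor_fd, long long start, long long len)
{
    struct move_extent me;

    assert(len > 0);  /* an empty range has nothing to exchange */
    memset(&me, 0, sizeof(me));
    me.org_fd = org_fd;
    me.dest_fd = donor_fd;
    me.start = start;
    me.len = len;

    /* Assumption: the ioctl is issued on the target file's descriptor. */
    if (ioctl(org_fd, EXT4_IOC_DEFRAG, &me) != 0)
        return -errno;
    return 0;
}
```

Passing both descriptors inside the struct keeps the call independent of which fd the ioctl is issued on, which is one reason the sketch zeroes the struct and fills every field explicitly.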
The defrag -r and -f options can also be implemented with (1) and (2)
from your previous post. I will address them after implementing the ordinary defrag.
Regards,
Akira Fujita
Theodore Tso wrote:
> On Tue, Dec 09, 2008 at 11:26:37AM +0900, Akira Fujita wrote:
>> I'm redesigning ext4 online defrag based on the comments from Ted.
>> Probably defrag's block allocation method will be changed greatly.
>
> Akira-san,
>
> FYI, there was a discussion about defrag on today's ext4 call. One of
> the ideas that was kicked around was to completely change the
> primitives used by defrag, and to design things around three
> primitive, general purpose interfaces.
>
> We didn't go into complete detail on the call, but let me give you a
> strawman proposal for consideration/discussion:
>
> (1) An (ioctl-based) interface which allows a privileged program to
> specify one or more ranges of blocks which the filesystem's block
> allocator must NOT allocate from. (We may want to have a flag for
> each block range which either makes the block lockout advisory, such
> that if the block allocator can't find blocks anywhere else, it may
> invade the reserved block area --- or mandatory, where if there are no
> other blocks, the filesystem returns ENOSPC). This allows the
> defragmenter to work on an area of the disk without worrying about
> concurrent allocations by other processes getting in the way.
>
> (2) An (ioctl-based) interface which associates with an inode
> preferred range(s) of blocks which the block allocator will try using
> first; if those blocks are not available, or the block range(s) is
> exhausted, the block allocator uses its normal algorithms to pick the
> best available block. The set of preferred blocks is only guaranteed
> to persist while the inode is in memory.
>
> (3) An (ioctl-based) interface which takes two inode numbers, and
> allows a privileged program to "defrag" an inode by using blocks from
> a donor inode and using them as the new blocks for the destination
> inode, preserving the contents of the destination inode.
>
> The advantage of this implementation strategy is that each of the
> interfaces can be implemented one at a time, with very well defined
> semantics, and which can be independently tested. The semantics can
> also be used in different combinations to solve alternate problems.
> For example, a combination of (1) and (2) can be used to reserve
> blocks for use by a directory that is expected to grow, so the
> directory can use contiguous blocks. Or, they could be used to
> implement an "online shrink" that would allow a filesystem to be
> resized to a smaller size.
>
> One other thing that comes to mind. If it turns out that these
> interfaces have multiple users, and in some cases the reservations or
> block allocation restrictions are expected to last for longer than a
> process lifetime, it may be useful to tag them with a short (8-16
> character) name, so that it is possible to list the current set of
> reservations, and so they can be removed by a privileged user. This
> could be overdesigning the interface; but the whole *point* of
> thinking about the interfaces from a more generic point of view (as
> opposed to use by a specific program for which the kernel interfaces
> are custom-designed) is that hopefully they will have multiple use
> cases and multiple users, in which case we need to worry about how
> multiple users can co-exist.
>
> Thoughts, comments?
>
> - Ted
>