[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1165969238.3771.34.camel@dyn9047017103.beaverton.ibm.com>
Date: Tue, 12 Dec 2006 16:20:38 -0800
From: Mingming Cao <cmm@...ibm.com>
To: "Amit K. Arora" <aarora@...ux.vnet.ibm.com>
Cc: linux-ext4@...r.kernel.org, suparna@...ibm.com, suzuki@...ibm.com
Subject: Re: [RFC][Patch 1/1] Persistent preallocation in ext4
On Tue, 2006-12-12 at 11:53 +0530, Amit K. Arora wrote:
> Hi Mingming,
>
Hi Amit,
> On Mon, Dec 11, 2006 at 05:28:15PM -0800, Mingming Cao wrote:
> > On Wed, 2006-12-06 at 11:28 +0530, Amit K. Arora wrote:
> >
> > > @@ -1142,13 +1155,22 @@
> > > /* try to insert block into found extent and return */
> > > if (ex && ext4_can_extents_be_merged(inode, ex, newext)) {
> > > ext_debug("append %d block to %d:%d (from %llu)\n",
> > > - le16_to_cpu(newext->ee_len),
> > > + ext4_ext_get_actual_len(newext),
> > > le32_to_cpu(ex->ee_block),
> > > - le16_to_cpu(ex->ee_len), ext_pblock(ex));
> > > + ext4_ext_get_actual_len(ex), ext_pblock(ex));
> > > if ((err = ext4_ext_get_access(handle, inode, path + depth)))
> > > return err;
> > > - ex->ee_len = cpu_to_le16(le16_to_cpu(ex->ee_len)
> > > - + le16_to_cpu(newext->ee_len));
> > > +
> > > + /* ext4_can_extents_be_merged should have checked that either
> > > + * both extents are uninitialized, or both aren't. Thus we
> > > + * need to check any of them here.
> > > + */
> > > + if (ext4_ext_is_uninitialized(ex))
> > > + uninitialized = 1;
> > > + ex->ee_len = cpu_to_le16(ext4_ext_get_actual_len(ex)
> > > + + ext4_ext_get_actual_len(newext));
>
> Above line will "remove" the uninitialized bit from "ex", if it was set.
> We call ext4_ext_get_actual_len() to get the "actual" lengths of the two
> extents, which removes this MSB in ee_len (MSB in ee_len is used to mark
> an extent uninitialized). Now, we do this because if lengths of two
> uninitialized extents will be added as it is (i.e. without masking out
> the MSB in the length), it will result in removing the MSB in ee_len.
> For example, 0x8002 + 0x8003 => 0x10005 => 0x5 (since ee_len is 16 bit).
>
> That is why just before this line, we save the "state" of this extent,
> whether it was uninitialized or not. And, we restore this "state" below.
Ah...you are right:)
More comments below...
> Index: linux-2.6.19/fs/ext4/ioctl.c
> ===================================================================
> --- linux-2.6.19.orig/fs/ext4/ioctl.c 2006-12-06 10:18:12.000000000 +0530
> +++ linux-2.6.19/fs/ext4/ioctl.c 2006-12-06 10:18:15.000000000 +0530
> @@ -248,6 +248,47 @@
> return err;
> }
>
> + case EXT4_IOC_PREALLOCATE: {
> + struct ext4_falloc_input input;
> + handle_t *handle;
> + ext4_fsblk_t block, max_blocks;
> + int ret = 0;
> + struct buffer_head map_bh;
> + unsigned int blkbits = inode->i_blkbits;
> +
> + if (IS_RDONLY(inode))
> + return -EROFS;
> +
> + if (copy_from_user(&input,
> + (struct ext4_falloc_input __user *) arg, sizeof(input)))
> + return -EFAULT;
> +
> + if (input.len == 0)
> + return -EINVAL;
> +
> + if (!(EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL))
> + return -ENOTTY;
>
Supporting preallocation for extent based files seems fairly
straightforward. I agree we should look at this first. After get this
done, it probably worth re-consider whether to support preallocation for
non-extent based files on ext4. I could imagine user upgrade from ext3
to ext4, and expecting to use preallocation on those existing files....
> +
> + block = EXT4_BLOCK_ALIGN(input.offset, blkbits) >> blkbits;
> + max_blocks = EXT4_BLOCK_ALIGN(input.len, blkbits) >> blkbits;
> + handle=ext4_journal_start(inode,
> + EXT4_DATA_TRANS_BLOCKS(inode->i_sb)+max_blocks);
> + if (IS_ERR(handle))
> + return PTR_ERR(handle);
> + while(ret>=0 && ret<max_blocks)
> + {
> + block = block + ret;
> + max_blocks = max_blocks - ret;
> + ret = ext4_ext_get_blocks(handle, inode, block,
> + max_blocks, &map_bh,
> + EXT4_CREATE_UNINITIALIZED_EXT, 1);
Since the interface takes offset and number of blocks to allocate, I
assuming we are going to handle holes in preallocation, thus, we cannot
not mark the extend_size flag to 1 when calling ext4_ext_get_blocks.
I think we should update i_size and i_disksize after preallocation. Oh,
to protect parallel updating i_size, we have to take i_mutex down.
> + }
> + ext4_mark_inode_dirty(handle, inode);
> + ext4_journal_stop(handle);
> +
Error code returned by ext4_journal_stop() is being ignored here, is
this right?
Well, there are other places in ext34/ioctl.c which ignore the return
returned by ext4_journal_stop(), maybe should fix this in a separate
patch.
> + return ret>0?0:ret;
> + }
>
>
Oh, what if we failed to allocate the full amount of blocks? i.e, the
ext4_ext_get_blocks() returns -ENOSPC error and exit the loop early. Are
we going to return error, or try something like
if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries))
goto retry
I wonder it might be useful to return the actual number of blocks
preallocated back to the application.
Mingming
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists