[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5df78e1d0908311240s3205b4bcrb65b2552b4ed579c@mail.gmail.com>
Date: Mon, 31 Aug 2009 12:40:29 -0700
From: Jiaying Zhang <jiayingz@...gle.com>
To: Theodore Tso <tytso@....edu>
Cc: Andreas Dilger <adilger@....com>,
Frank Mayhar <fmayhar@...gle.com>,
Eric Sandeen <sandeen@...hat.com>,
Curt Wohlgemuth <curtw@...gle.com>,
ext4 development <linux-ext4@...r.kernel.org>
Subject: Re: Question on fallocate/ftruncate sequence
Thanks a lot for the comments and suggestions!
On Sat, Aug 29, 2009 at 7:52 PM, Theodore Tso<tytso@....edu> wrote:
> On Fri, Aug 28, 2009 at 05:40:54PM -0700, Jiaying Zhang wrote:
>> --- .pc/fallocate_keepsizse.patch/fs/attr.c 2009-08-28 15:38:46.000000000 -0700
>> +++ fs/attr.c 2009-08-28 17:01:04.000000000 -0700
>> @@ -68,7 +68,8 @@ int inode_setattr(struct inode * inode,
>> unsigned int ia_valid = attr->ia_valid;
>>
>> if (ia_valid & ATTR_SIZE &&
>> - (attr->ia_size != i_size_read(inode)) {
>> + (attr->ia_size != i_size_read(inode) ||
>> + (inode->i_flags & FS_KEEPSIZE_FL))) {
>> int error = vmtruncate(inode, attr->ia_size);
>> if (error)
>> return error;
>
> Instead of doing this in the generic code, it really should be done in
> ext4_setattr. Technically speaking, we don't actually need the
> FS_KEEPSIZE_FL to solve this problem; instead we can simply have the
> ext4 code look in the extent tree to see if there are any blocks
> mapped beyond the logical block:
>
> i_size_read(inode) >> inode->i_sb->s_blocksize_bits
Is it relatively cheap to scan the extent tree? Will this add the
overhead to truncate?
>
> Having a flag as Andreas suggested does help with the issue of e2fsck
> noticing whether or not i_size is incorrect (and should be fixed) or
> the file has been extended. So keeping having the flag is an OK thing
> to do, but we need to be careful about a particularly subtle
> overloading problem. The flags FS_*_FL as defined in
> include/linux/fs.h are technically only for in-memory use. The ext4
> on-disk format flags is EXT4_*_FL, and defined in ext4.h.
>
> The flags were originially defined for use in ext2/3/4, but later on
> other filesystems adopted those flags so that e2fsprogs's chattr and
> lsattr programs could be used for their filesystems as well. It just
> so happens that for ext2/3/4 the on-disk encoding of those flags in
> the in-memory encoding of those flags in i_flags are the same, but
> that means that the flags need to be defined in both places to avoid
> assignment overlaps. We also need to be clear whether the flags are
> internal flags for ext4's use only, or flags meant for use by all
> filesystems. This is why the testing for FS_KEEPSIZE_FL in fs/attr is
> particularly bad, if the flag are going to be set in fs/ext4/extents.c.
>
> It's better to define the flag as EXT4_KEEPSIZE_FL, and to use it as
> EXT4_KEEPSIZE_FL, but make a note of that bitfield position as being
> reserved in include/linux/fs.h.
Here is the modified patch based on your suggestions. I stick with the
KEEPSIZE_FL approach that I think can allow us to handle the special
truncation accordingly during fsck. Other file systems can also re-use
this flag when they want to support fallocate with KEEP_SIZE. As you
suggested, I moved the EXT4_KEEPSIZE_FL checking to ext4_setattr
that now calls vmtruncate if the KEEPSIZE flag is set in the i_flag.
Please let me know what you think about this proposed patch.
Thanks a lot!
Jiaying
--- .pc/fallocate_keepsizse.patch/fs/ext4/extents.c 2009-08-31
12:08:10.000000000 -0700
+++ fs/ext4/extents.c 2009-08-31 12:12:16.000000000 -0700
@@ -3095,7 +3095,13 @@ static void ext4_falloc_update_inode(str
i_size_write(inode, new_size);
if (new_size > EXT4_I(inode)->i_disksize)
ext4_update_i_disksize(inode, new_size);
+ inode->i_flags &= ~EXT4_KEEPSIZE_FL;
} else {
+ /*
+ * Mark that we allocate beyond EOF so the subsequent truncate
+ * can proceed even if the new size is the same as i_size.
+ */
+ inode->i_flags |= EXT4_KEEPSIZE_FL;
}
}
--- .pc/fallocate_keepsizse.patch/fs/ext4/inode.c 2009-08-31
12:08:10.000000000 -0700
+++ fs/ext4/inode.c 2009-08-31 12:12:16.000000000 -0700
@@ -3973,6 +3973,8 @@ void ext4_truncate(struct inode *inode)
if (!ext4_can_truncate(inode))
return;
+ inode->i_flags &= ~EXT4_KEEPSIZE_FL;
+
if (inode->i_size == 0 && !test_opt(inode->i_sb, NO_AUTO_DA_ALLOC))
ei->i_state |= EXT4_STATE_DA_ALLOC_CLOSE;
@@ -4807,7 +4809,9 @@ int ext4_setattr(struct dentry *dentry,
}
if (S_ISREG(inode->i_mode) &&
- attr->ia_valid & ATTR_SIZE && attr->ia_size < inode->i_size) {
+ attr->ia_valid & ATTR_SIZE &&
+ (attr->ia_size < inode->i_size ||
+ (inode->i_flags & EXT4_KEEPSIZE_FL))) {
handle_t *handle;
handle = ext4_journal_start(inode, 3);
@@ -4838,6 +4842,11 @@ int ext4_setattr(struct dentry *dentry,
goto err_out;
}
}
+ if ((inode->i_flags & EXT4_KEEPSIZE_FL)) {
+ rc = vmtruncate(inode, attr->ia_size);
+ if (rc)
+ goto err_out;
+ }
}
rc = inode_setattr(inode, attr);
--- .pc/fallocate_keepsizse.patch/include/linux/fs.h 2009-08-31
12:08:10.000000000 -0700
+++ include/linux/fs.h 2009-08-31 12:12:16.000000000 -0700
@@ -343,6 +343,7 @@ struct inodes_stat_t {
#define FS_TOPDIR_FL 0x00020000 /* Top of directory hierarchies*/
#define FS_EXTENT_FL 0x00080000 /* Extents */
#define FS_DIRECTIO_FL 0x00100000 /* Use direct i/o */
+#define FS_KEEPSIZE_FL 0x00200000 /* Blocks allocated beyond EOF */
#define FS_RESERVED_FL 0x80000000 /* reserved for ext2 lib */
#define FS_FL_USER_VISIBLE 0x0003DFFF /* User visible flags */
--- .pc/fallocate_keepsizse.patch/fs/ext4/ext4.h 2009-08-31
12:08:10.000000000 -0700
+++ fs/ext4/ext4.h 2009-08-31 12:12:16.000000000 -0700
@@ -235,6 +235,7 @@ struct flex_groups {
#define EXT4_HUGE_FILE_FL 0x00040000 /* Set to each huge file */
#define EXT4_EXTENTS_FL 0x00080000 /* Inode uses extents */
#define EXT4_EXT_MIGRATE 0x00100000 /* Inode is migrating */
+#define EXT4_KEEPSIZE_FL 0x00200000 /* Blocks allocated beyond EOF
(bit reserved in fs.h) */
#define EXT4_RESERVED_FL 0x80000000 /* reserved for ext4 lib */
#define EXT4_FL_USER_VISIBLE 0x000BDFFF /* User visible flags */
>
> - Ted
>
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists