[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.2.00.1107011206580.4168@dhcp-27-109.brq.redhat.com>
Date: Fri, 1 Jul 2011 12:09:48 +0200 (CEST)
From: Lukas Czerner <lczerner@...hat.com>
To: Tao Ma <tm@....ma>
cc: linux-ext4@...r.kernel.org, tytso@....edu
Subject: Re: [PATCH 4/4] ext4: Speed up FITRIM by recording flags in
ext4_group_info.
On Thu, 30 Jun 2011, Tao Ma wrote:
> From: Tao Ma <boyu.mt@...bao.com>
>
> In ext4, when FITRIM is called every time, we iterate all the
> groups and do trim one by one. It is a bit time wasting if the
> group has been trimmed and there is no change since the last
> trim.
>
> So this patch adds a new flag in ext4_group_info->bb_state to
> indicate that the group has been trimmed, and it will be cleared
> if some blocks is freed(in release_blocks_on_commit). Another
> trim_minlen is added in ext4_sb_info to record the last minlen
> we use to trim the volume, so that if the caller provide a small
> one, we will go on the trim regardless of the bb_state.
>
> A simple test with my intel x25m ssd:
> df -h shows:
> /dev/sdb1 40G 21G 17G 56% /mnt/ext4
> Block size: 4096
>
> run the FITRIM with the following parameter:
> range.start = 0;
> range.len = UINT64_MAX;
> range.minlen = 1048576;
>
> without the patch:
> [root@...u-tm linux-2.6]# time ./ftrim /mnt/ext4/a
> real 0m5.505s
> user 0m0.000s
> sys 0m1.224s
> [root@...u-tm linux-2.6]# time ./ftrim /mnt/ext4/a
> real 0m5.359s
> user 0m0.000s
> sys 0m1.178s
> [root@...u-tm linux-2.6]# time ./ftrim /mnt/ext4/a
> real 0m5.228s
> user 0m0.000s
> sys 0m1.151s
>
> with the patch:
> [root@...u-tm linux-2.6]# time ./ftrim /mnt/ext4/a
> real 0m5.625s
> user 0m0.000s
> sys 0m1.269s
> [root@...u-tm linux-2.6]# time ./ftrim /mnt/ext4/a
> real 0m0.002s
> user 0m0.000s
> sys 0m0.001s
> [root@...u-tm linux-2.6]# time ./ftrim /mnt/ext4/a
> real 0m0.002s
> user 0m0.000s
> sys 0m0.001s
>
> A big improvement for the 2nd and 3rd run.
>
> Even after I delete some big image files, it is still much
> faster than iterating the whole disk.
>
> [root@...u-tm test]# time ./ftrim /mnt/ext4/a
> real 0m1.217s
> user 0m0.000s
> sys 0m0.196s
Thanks for the patch Tao, I have couple of comments bellow.
>
> Reviewed-by: Andreas Dilger <adilger.kernel@...ger.ca>
> Signed-off-by: Tao Ma <boyu.mt@...bao.com>
> ---
> fs/ext4/ext4.h | 13 ++++++++++++-
> fs/ext4/mballoc.c | 20 ++++++++++++++++++++
> fs/ext4/super.c | 2 ++
> 3 files changed, 34 insertions(+), 1 deletions(-)
>
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 1921392..5878a22 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1214,6 +1214,9 @@ struct ext4_sb_info {
>
> /* Kernel thread for multiple mount protection */
> struct task_struct *s_mmp_tsk;
> +
> + /* record the last minlen when FITRIM is called. */
> + atomic_t s_last_trim_minblks;
> };
>
> static inline struct ext4_sb_info *EXT4_SB(struct super_block *sb)
> @@ -2067,11 +2070,19 @@ struct ext4_group_info {
> * 5 free 8-block regions. */
> };
>
> -#define EXT4_GROUP_INFO_NEED_INIT_BIT 0
> +#define EXT4_GROUP_INFO_NEED_INIT_BIT 0
> +#define EXT4_GROUP_INFO_WAS_TRIMMED_BIT 1
>
> #define EXT4_MB_GRP_NEED_INIT(grp) \
> (test_bit(EXT4_GROUP_INFO_NEED_INIT_BIT, &((grp)->bb_state)))
>
> +#define EXT4_MB_GRP_WAS_TRIMMED(grp) \
> + (test_bit(EXT4_GROUP_INFO_WAS_TRIMMED_BIT, &((grp)->bb_state)))
> +#define EXT4_MB_GRP_SET_TRIMMED(grp) \
> + (set_bit(EXT4_GROUP_INFO_WAS_TRIMMED_BIT, &((grp)->bb_state)))
> +#define EXT4_MB_GRP_CLEAR_TRIMMED(grp) \
> + (clear_bit(EXT4_GROUP_INFO_WAS_TRIMMED_BIT, &((grp)->bb_state)))
> +
> #define EXT4_MAX_CONTENTION 8
> #define EXT4_CONTENTION_THRESHOLD 2
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index a822f2a..afdd869 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2628,6 +2628,15 @@ static void release_blocks_on_commit(journal_t *journal, transaction_t *txn)
> rb_erase(&entry->node, &(db->bb_free_root));
> mb_free_blocks(NULL, &e4b, entry->start_blk, entry->count);
>
> + /*
> + * Clear the trimmed flag for the group so that the next
> + * ext4_trim_fs can trim it.
> + * If the volume is mounted with -o discard, online discard
> + * is supported and the free blocks will be trimmed online.
> + */
> + if (!test_opt(sb, DISCARD))
> + EXT4_MB_GRP_CLEAR_TRIMMED(db);
If the online discard failed we should probably clear the flag as well
right ? However I am not sure how helpful it would be since it'll most
likely fail again. Maybe we do not need to bother with that case at all,
what do you think ?
> +
> if (!db->bb_free_root.rb_node) {
> /* No more items in the per group rb tree
> * balance refcounts from ext4_mb_free_metadata()
> @@ -4840,6 +4849,10 @@ ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
> bitmap = e4b.bd_bitmap;
>
> ext4_lock_group(sb, group);
> + if (EXT4_MB_GRP_WAS_TRIMMED(e4b.bd_info) &&
> + minblocks >= atomic_read(&EXT4_SB(sb)->s_last_trim_minblks))
> + goto out;
> +
> start = (e4b.bd_info->bb_first_free > start) ?
> e4b.bd_info->bb_first_free : start;
>
> @@ -4871,6 +4884,10 @@ ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
> if ((e4b.bd_info->bb_free - free_count) < minblocks)
> break;
> }
> +
> + if (!ret)
> + EXT4_MB_GRP_SET_TRIMMED(e4b.bd_info);
> +out:
> ext4_unlock_group(sb, group);
> ext4_mb_unload_buddy(&e4b);
>
> @@ -4960,6 +4977,9 @@ int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range)
> }
> range->len = trimmed * sb->s_blocksize;
>
> + if (!ret)
> + atomic_set(&EXT4_SB(sb)->s_last_trim_minblks, minlen);
> +
> out:
> return ret;
> }
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 9ea71aa..080e9f9 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -3086,6 +3086,8 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
> sbi->s_sectors_written_start =
> part_stat_read(sb->s_bdev->bd_part, sectors[1]);
>
> + atomic_set(&sbi->s_last_trim_minblks, 0);
> +
I am not sure that this is needed since sbi is allocated with kzalloc,
so it should be zero already.
> /* Cleanup superblock name */
> for (cp = sb->s_id; (cp = strchr(cp, '/'));)
> *cp = '!';
>
Thanks!
-Lukas
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists