lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.2.00.1107011206580.4168@dhcp-27-109.brq.redhat.com>
Date:	Fri, 1 Jul 2011 12:09:48 +0200 (CEST)
From:	Lukas Czerner <lczerner@...hat.com>
To:	Tao Ma <tm@....ma>
cc:	linux-ext4@...r.kernel.org, tytso@....edu
Subject: Re: [PATCH 4/4] ext4: Speed up FITRIM by recording flags in
 ext4_group_info.

On Thu, 30 Jun 2011, Tao Ma wrote:

> From: Tao Ma <boyu.mt@...bao.com>
> 
> In ext4, when FITRIM is called every time, we iterate all the
> groups and do trim one by one. It is a bit time wasting if the
> group has been trimmed and there is no change since the last
> trim.
> 
> So this patch adds a new flag in ext4_group_info->bb_state to
> indicate that the group has been trimmed, and it will be cleared
> if some blocks is freed(in release_blocks_on_commit). Another
> trim_minlen is added in ext4_sb_info to record the last minlen
> we use to trim the volume, so that if the caller provide a small
> one, we will go on the trim regardless of the bb_state.
> 
> A simple test with my intel x25m ssd:
> df -h shows:
> /dev/sdb1              40G   21G   17G  56% /mnt/ext4
> Block size:               4096
> 
> run the FITRIM with the following parameter:
> range.start = 0;
> range.len = UINT64_MAX;
> range.minlen = 1048576;
> 
> without the patch:
> [root@...u-tm linux-2.6]# time ./ftrim /mnt/ext4/a
> real	0m5.505s
> user	0m0.000s
> sys	0m1.224s
> [root@...u-tm linux-2.6]# time ./ftrim /mnt/ext4/a
> real	0m5.359s
> user	0m0.000s
> sys	0m1.178s
> [root@...u-tm linux-2.6]# time ./ftrim /mnt/ext4/a
> real	0m5.228s
> user	0m0.000s
> sys	0m1.151s
> 
> with the patch:
> [root@...u-tm linux-2.6]# time ./ftrim /mnt/ext4/a
> real	0m5.625s
> user	0m0.000s
> sys	0m1.269s
> [root@...u-tm linux-2.6]# time ./ftrim /mnt/ext4/a
> real	0m0.002s
> user	0m0.000s
> sys	0m0.001s
> [root@...u-tm linux-2.6]# time ./ftrim /mnt/ext4/a
> real	0m0.002s
> user	0m0.000s
> sys	0m0.001s
> 
> A big improvement for the 2nd and 3rd run.
> 
> Even after I delete some big image files, it is still much
> faster than iterating the whole disk.
> 
> [root@...u-tm test]# time ./ftrim /mnt/ext4/a
> real	0m1.217s
> user	0m0.000s
> sys	0m0.196s

Thanks for the patch Tao, I have couple of comments bellow.

> 
> Reviewed-by: Andreas Dilger <adilger.kernel@...ger.ca>
> Signed-off-by: Tao Ma <boyu.mt@...bao.com>
> ---
>  fs/ext4/ext4.h    |   13 ++++++++++++-
>  fs/ext4/mballoc.c |   20 ++++++++++++++++++++
>  fs/ext4/super.c   |    2 ++
>  3 files changed, 34 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 1921392..5878a22 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1214,6 +1214,9 @@ struct ext4_sb_info {
>  
>  	/* Kernel thread for multiple mount protection */
>  	struct task_struct *s_mmp_tsk;
> +
> +	/* record the last minlen when FITRIM is called. */
> +	atomic_t s_last_trim_minblks;
>  };
>  
>  static inline struct ext4_sb_info *EXT4_SB(struct super_block *sb)
> @@ -2067,11 +2070,19 @@ struct ext4_group_info {
>  					 * 5 free 8-block regions. */
>  };
>  
> -#define EXT4_GROUP_INFO_NEED_INIT_BIT	0
> +#define EXT4_GROUP_INFO_NEED_INIT_BIT		0
> +#define EXT4_GROUP_INFO_WAS_TRIMMED_BIT		1
>  
>  #define EXT4_MB_GRP_NEED_INIT(grp)	\
>  	(test_bit(EXT4_GROUP_INFO_NEED_INIT_BIT, &((grp)->bb_state)))
>  
> +#define EXT4_MB_GRP_WAS_TRIMMED(grp)	\
> +	(test_bit(EXT4_GROUP_INFO_WAS_TRIMMED_BIT, &((grp)->bb_state)))
> +#define EXT4_MB_GRP_SET_TRIMMED(grp)	\
> +	(set_bit(EXT4_GROUP_INFO_WAS_TRIMMED_BIT, &((grp)->bb_state)))
> +#define EXT4_MB_GRP_CLEAR_TRIMMED(grp)	\
> +	(clear_bit(EXT4_GROUP_INFO_WAS_TRIMMED_BIT, &((grp)->bb_state)))
> +
>  #define EXT4_MAX_CONTENTION		8
>  #define EXT4_CONTENTION_THRESHOLD	2
>  
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index a822f2a..afdd869 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2628,6 +2628,15 @@ static void release_blocks_on_commit(journal_t *journal, transaction_t *txn)
>  		rb_erase(&entry->node, &(db->bb_free_root));
>  		mb_free_blocks(NULL, &e4b, entry->start_blk, entry->count);
>  
> +		/*
> +		 * Clear the trimmed flag for the group so that the next
> +		 * ext4_trim_fs can trim it.
> +		 * If the volume is mounted with -o discard, online discard
> +		 * is supported and the free blocks will be trimmed online.
> +		 */
> +		if (!test_opt(sb, DISCARD))
> +			EXT4_MB_GRP_CLEAR_TRIMMED(db);

If the online discard failed we should probably clear the flag as well
right ? However I am not sure how helpful it would be since it'll most
likely fail again. Maybe we do not need to bother with that case at all,
what do you think ?

> +
>  		if (!db->bb_free_root.rb_node) {
>  			/* No more items in the per group rb tree
>  			 * balance refcounts from ext4_mb_free_metadata()
> @@ -4840,6 +4849,10 @@ ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
>  	bitmap = e4b.bd_bitmap;
>  
>  	ext4_lock_group(sb, group);
> +	if (EXT4_MB_GRP_WAS_TRIMMED(e4b.bd_info) &&
> +	    minblocks >= atomic_read(&EXT4_SB(sb)->s_last_trim_minblks))
> +		goto out;
> +
>  	start = (e4b.bd_info->bb_first_free > start) ?
>  		e4b.bd_info->bb_first_free : start;
>  
> @@ -4871,6 +4884,10 @@ ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
>  		if ((e4b.bd_info->bb_free - free_count) < minblocks)
>  			break;
>  	}
> +
> +	if (!ret)
> +		EXT4_MB_GRP_SET_TRIMMED(e4b.bd_info);
> +out:
>  	ext4_unlock_group(sb, group);
>  	ext4_mb_unload_buddy(&e4b);
>  
> @@ -4960,6 +4977,9 @@ int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range)
>  	}
>  	range->len = trimmed * sb->s_blocksize;
>  
> +	if (!ret)
> +		atomic_set(&EXT4_SB(sb)->s_last_trim_minblks, minlen);
> +
>  out:
>  	return ret;
>  }
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 9ea71aa..080e9f9 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -3086,6 +3086,8 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>  		sbi->s_sectors_written_start =
>  			part_stat_read(sb->s_bdev->bd_part, sectors[1]);
>  
> +	atomic_set(&sbi->s_last_trim_minblks, 0);
> +

I am not sure that this is needed since sbi is allocated with kzalloc,
so it should be zero already.

>  	/* Cleanup superblock name */
>  	for (cp = sb->s_id; (cp = strchr(cp, '/'));)
>  		*cp = '!';
> 

Thanks!
-Lukas
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ