lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <AE5D98E4-6F8C-4785-93C0-919FC2DDDB24@dilger.ca>
Date:   Thu, 13 Aug 2020 03:40:57 -0600
From:   Andreas Dilger <adilger@...ger.ca>
To:     brookxu <brookxu.cn@...il.com>
Cc:     Theodore Ts'o <tytso@....edu>,
        Ext4 Developers List <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH] ext4: put grp related checks into ext4_mb_good_group()

On Aug 7, 2020, at 8:01 AM, brookxu <brookxu.cn@...il.com> wrote:
> 
> We will make these judgments in ext4_mb_good_group(), maybe there
> is no need to repeat judgments here.

This patch looks like it _may_ have some performance impact on a large
filesystem that has some of the block groups that are mostly full, or
the group is already marked corrupt.

These checks are trying to avoid initializing the group's block allocation
bitmap(s) from disk and initializing the buddy bitmap(s) if that is not
actually needed.  See comment in ext4_mb_regular_allocator():

		/* This now checks without needing the buddy page */
		ret = ext4_mb_good_group_nolock(ac, group, cr);

		err = ext4_mb_load_buddy(sb, group, &e4b);
		if (err)
			goto out;

		ext4_lock_group(sb, group);

		/*
		 * We need to check again after locking the
		 * block group
		 */
		ret = ext4_mb_good_group(ac, group, cr);

If those checks are not done in ext4_mb_good_group_nolock() to return 0
to the caller, then the call to ext4_mb_init_group() and ext4_mb_load_buddy()
will be done, only to find in ext4_mb_good_group() that the group is no good.

There were also some performance-related patches in the past that show
the calls to ext4_mb_load_buddy() is expensive with real filesystems:

    commit 1c8457cadc9cefe7ec920a2f3537ff1fe20f4061
    Author:     Aditya Kali <adityakali@...gle.com>
    AuthorDate: Sat Jun 30 19:10:57 2012 -0400

    ext4: avoid uneeded calls to ext4_mb_load_buddy() while reading mb_groups

    Currently ext4_mb_load_buddy is called for every group, irrespective
    of whether the group info is already in memory, while reading
    /proc/fs/ext4/<partition>/mb_groups proc file.  For the purpose of
    mb_groups proc file, it is unnecessary to load the file group info
    from disk if it was loaded in past.  These calls to ext4_mb_load_buddy
    make reading the mb_groups proc file expensive.


    commit 78944086663e6c1b03f3d60bf7610128149be5fc
    Author:     Lukas Czerner <lczerner@...hat.com>
    AuthorDate: Tue May 24 18:16:27 2011 -0400

    ext4: only load buddy bitmap in ext4_trim_fs() when it is needed

    Currently we are loading buddy ext4_mb_load_buddy() for every block
    group we are going through in ext4_trim_fs() in many cases just to find
    out that there is not enough space to be bothered with. As Amir Goldstein
    suggested we can use bb_free information directly from ext4_group_info.

    This commit removes ext4_mb_load_buddy() from ext4_trim_fs() and rather
    get the ext4_group_info via ext4_get_group_info() and use the bb_free
    information directly from that. This avoids unnecessary call to load
    buddy in the case the group does not have enough free space to trim.

I think some options still exist for this patch:
- make a helper routine like "ext4_mb_group_unsuitable(ac, grp, cr)
  that does these common checks and use it in both ext4_mb_good_group()
  and ext4_mb_good_group_nolock() to reduce code duplication.  However,
  checks in ext4_mb_good_group() have also been changed/removed in
  another of your patches, so it may not make sense anymore?

- add an optimization in ext4_mb_good_group_nolock() to do the
  unlock/lock only in the unlikely case that GRP_NEED_INIT is true:

	if (should_lock)
		ext4_lock_group(sb, group);
	free = grp->bb_free;
	if (free == 0)
		goto out;
	if (cr <= 2 && free < ac->ac_g_ex.fe_len)
		goto out;
	if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(grp)))
		goto out;

-	if (should_lock)
-		ext4_unlock_group(sb, group);

        /* We only do this if the grp has never been initialized */
        if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
+	        if (should_lock)
+			ext4_unlock_group(sb, group);
		ret = ext4_mb_init_group(ac->ac_sb, group, GFP_NOFS);
		if (ret)
			return ret;
+		if (should_lock)
+			ext4_lock_group(sb, group);
        }

-	if (should_lock)
-		ext4_lock_group(sb, group);
	ret = ext4_mb_good_group(ac, group, cr);
out:
	if (should_lock)
		ext4_unlock_group(sb, group);
	return ret;
}

That will avoid some lock contention on the group lock in the common case,
which will probably avoid more overhead than .

Cheers, Andreas

> Signed-off-by: Chunguang Xu <brookxu@...cent.com>
> ---
> fs/ext4/mballoc.c | 16 ++--------------
> 1 file changed, 2 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 4304113..84871f7 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2178,21 +2178,8 @@ static int ext4_mb_good_group_nolock(struct ext4_allocation_context *ac,
> 	struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb, group);
> 	struct super_block *sb = ac->ac_sb;
> 	bool should_lock = ac->ac_flags & EXT4_MB_STRICT_CHECK;
> -	ext4_grpblk_t free;
> 	int ret = 0;
> 
> -	if (should_lock)
> -		ext4_lock_group(sb, group);
> -	free = grp->bb_free;
> -	if (free == 0)
> -		goto out;
> -	if (cr <= 2 && free < ac->ac_g_ex.fe_len)
> -		goto out;
> -	if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(grp)))
> -		goto out;
> -	if (should_lock)
> -		ext4_unlock_group(sb, group);
> -
> 	/* We only do this if the grp has never been initialized */
> 	if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
> 		ret = ext4_mb_init_group(ac->ac_sb, group, GFP_NOFS);
> @@ -2202,8 +2189,9 @@ static int ext4_mb_good_group_nolock(struct ext4_allocation_context *ac,
> 
> 	if (should_lock)
> 		ext4_lock_group(sb, group);
> +
> 	ret = ext4_mb_good_group(ac, group, cr);
> -out:
> +
> 	if (should_lock)
> 		ext4_unlock_group(sb, group);
> 	return ret;
> --
> 1.8.3.1
> 
> 


Cheers, Andreas






Download attachment "signature.asc" of type "application/pgp-signature" (874 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ