[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20FD8461-AEE4-4414-8637-4D1230AB5A20@dilger.ca>
Date: Thu, 20 Jul 2023 16:51:41 -0600
From: Andreas Dilger <adilger@...ger.ca>
To: JunChao Sun <sunjunchao2870@...il.com>
Cc: Theodore Ts'o <tytso@....edu>, Jan Kara <jack@...e.cz>,
Ext4 Developers List <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH v2] ext4: Replace value of xattrs in place
On May 9, 2023, at 6:14 PM, JunChao Sun <sunjunchao2870@...il.com> wrote:
>
> When replacing the value of an xattr found in an ea_inode, currently
> ext4 will evict the ea_inode that stores the old value, recreate an
> ea_inode, and then write the new value into the new ea_inode.
> This can be optimized by writing the new value into the old
> ea_inode directly.
Sorry for the long delay in reviewing this patch.
The performance gains are nice, and reducing overhead is always good.
One question about this patch is whether it is safe to overwrite the
xattr in place if the system crashes during the overwrite? That was
one of the reasons why the xattr update was implemented by writing to
a new xattr inode, and then atomically swapping xattr inode numbers.
However, if the xattr overwrite is done via journaled data writes then
it would be safe to "overwrite" the xattr data "in place", because
data will first be committed to the journal and then checkpointed into
the inode itself, so it should never be inconsistent/corrupted.
Did you also test cases where the xattr size is growing or shrinking
during the overwrite in place? That should allocate or free blocks
in the xattr inode so that they are not wasted.
Cheers, Andreas
> The logic for replacing value of xattrs without this patch
> is as follows:
> ext4_xattr_set_entry()
> ->ext4_xattr_inode_iget(&old_ea_inode)
> ->ext4_xattr_inode_lookup_create(&new_ea_inode)
> ->ext4_xattr_inode_dec_ref(old_ea_inode)
> ->iput(old_ea_inode)
> ->ext4_destroy_inode()
> ->ext4_evict_inode()
> ->ext4_free_inode()
> ->iput(new_ea_inode)
>
> The logic with this patch is:
> ext4_xattr_set_entry()
> ->ext4_xattr_inode_iget(&old_ea_inode)
> ->ext4_xattr_inode_write(old_ea_inode, new_value)
> ->iput(old_ea_inode)
>
> This patch reduces the time it takes to replace xattrs in the ext4.
> Without this patch, replacing the value of an xattr two million times takes
> about 45 seconds on Intel(R) Xeon(R) CPU E5-2620 v3 platform.
> With this patch, the same operation takes only 6 seconds.
>
> [root@...ent01 sjc]# ./mount.sh
> /dev/sdb1 contains a ext4 file system
> last mounted on /mnt/ext4 on Mon May 8 17:05:38 2023
> [root@...ent01 sjc]# touch /mnt/ext4/file1
> [root@...ent01 sjc]# gcc test.c
> [root@...ent01 sjc]# time ./a.out
>
> real 0m45.248s
> user 0m0.513s
> sys 0m39.231s
>
> [root@...ent01 sjc]# ./mount.sh
> /dev/sdb1 contains a ext4 file system
> last mounted on /mnt/ext4 on Mon May 8 17:08:20 2023
> [root@...ent01 sjc]# touch /mnt/ext4/file1
> [root@...ent01 sjc]# time ./a.out
>
> real 0m5.977s
> user 0m0.316s
> sys 0m5.659s
>
> The test.c and mount.sh are in [1].
> This patch passed the tests with xfstests using 'check -g quick'.
>
> [1] https://gist.github.com/sjc2870/c923d7fa627d10ab65d6c305afb02cdb
>
> Signed-off-by: JunChao Sun <sunjunchao2870@...il.com>
> ---
>
> Changes in v2:
> - Fix a problem when ref of an ea_inode not equal to 1
> - Link to v1: https://lore.kernel.org/linux-ext4/20230509011042.11781-1-sunjunchao2870@gmail.com/
>
> fs/ext4/xattr.c | 36 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
>
> diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
> index d57408cbe903..8f03958bfcc6 100644
> --- a/fs/ext4/xattr.c
> +++ b/fs/ext4/xattr.c
> @@ -1713,6 +1713,42 @@ static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
> }
> }
>
> + if (!s->not_found && i->value && here->e_value_inum && i->in_inode) {
> + /* Replace xattr value in ea_inode in place */
> + int size_diff = i->value_len - le32_to_cpu(here->e_value_size);
> +
> + ret = ext4_xattr_inode_iget(inode,
> + le32_to_cpu(here->e_value_inum),
> + le32_to_cpu(here->e_hash),
> + &old_ea_inode);
> + if (ret) {
> + old_ea_inode = NULL;
> + goto out;
> + }
> + if (ext4_xattr_inode_get_ref(old_ea_inode) == 1) {
> + if (size_diff > 0)
> + ret = ext4_xattr_inode_alloc_quota(inode, size_diff);
> + else if (size_diff < 0)
> + ext4_xattr_inode_free_quota(inode, NULL, -size_diff);
> + if (ret)
> + goto out;
> +
> + ret = ext4_xattr_inode_write(handle, old_ea_inode, i->value, i->value_len);
> + if (ret) {
> + if (size_diff > 0)
> + ext4_xattr_inode_free_quota(inode, NULL, size_diff);
> + else if (size_diff < 0)
> + ret = ext4_xattr_inode_alloc_quota(inode, -size_diff);
> + goto out;
> + }
> + here->e_value_size = cpu_to_le32(i->value_len);
> + new_ea_inode = old_ea_inode;
> + old_ea_inode = NULL;
> + goto update_hash;
> + } else
> + iput(old_ea_inode);
> + }
> +
> /*
> * Getting access to old and new ea inodes is subject to failures.
> * Finish that work before doing any modifications to the xattr data.
> --
> 1.8.3.1
>
Cheers, Andreas
Download attachment "signature.asc" of type "application/pgp-signature" (874 bytes)
Powered by blists - more mailing lists