lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Wed, 18 Oct 2023 16:44:23 +0800
From:   JunChao Sun <sunjunchao2870@...il.com>
To:     Andreas Dilger <adilger@...ger.ca>
Cc:     "Theodore Ts'o" <tytso@....edu>, Jan Kara <jack@...e.cz>,
        Ext4 Developers List <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH v2] ext4: Replace value of xattrs in place

Andreas Dilger <adilger@...ger.ca> 于2023年7月21日周五 06:50写道:
>
> On May 9, 2023, at 6:14 PM, JunChao Sun <sunjunchao2870@...il.com> wrote:
> >
> > When replacing the value of an xattr found in an ea_inode, currently
> > ext4 will evict the ea_inode that stores the old value, recreate an
> > ea_inode, and then write the new value into the new ea_inode.
> > This can be optimized by writing the new value into the old
> > ea_inode directly.
>
> Sorry for the long delay in reviewing this patch.
>
> The performance gains are nice, and reducing overhead is always good.
>
>
> > One question about this patch is whether it is safe to overwrite the
> > xattr in place if the system crashes during the overwrite?  That was
> > one of the reasons why the xattr update was implemented by writing to
> > a new xattr inode, and then atomically swapping xattr inode numbers.
> >
> > However, if the xattr overwrite is done via journaled data writes then
> > it would be safe to "overwrite" the xattr data "in place", because
> > data will first be committed to the journal and then checkpointed into
> > the inode itself, so it should never be inconsistent/corrupted.
Thanks for your review. I'm sorry for taking so long to reply..
I checked the relevant code, and ensured that xattr overwriting is
done via journal, so it should be safe.
>
>
> > Did you also test cases where the xattr size is growing or shrinking
> > during the overwrite in place?  That should allocate or free blocks
> > in the xattr inode so that they are not wasted.
> >
Here is a bug in this patch, when xattr size is shrinking, blocks that
were previously allocated but need to be released now will not be
released. Here may need a ext4_truncate to release redundant blocks. I
will check ext4_truncate function and test relative cases, and then
send patch v3.

> Cheers, Andreas
>
> > The logic for replacing value of xattrs without this patch
> > is as follows:
> > ext4_xattr_set_entry()
> >    ->ext4_xattr_inode_iget(&old_ea_inode)
> >    ->ext4_xattr_inode_lookup_create(&new_ea_inode)
> >    ->ext4_xattr_inode_dec_ref(old_ea_inode)
> >    ->iput(old_ea_inode)
> >        ->ext4_destroy_inode()
> >        ->ext4_evict_inode()
> >        ->ext4_free_inode()
> >    ->iput(new_ea_inode)
> >
> > The logic with this patch is:
> > ext4_xattr_set_entry()
> >    ->ext4_xattr_inode_iget(&old_ea_inode)
> >    ->ext4_xattr_inode_write(old_ea_inode, new_value)
> >    ->iput(old_ea_inode)
> >
> > This patch reduces the time it takes to replace xattrs in the ext4.
> > Without this patch, replacing the value of an xattr two million times takes
> > about 45 seconds on Intel(R) Xeon(R) CPU E5-2620 v3 platform.
> > With this patch, the same operation takes only 6 seconds.
> >
> >  [root@...ent01 sjc]# ./mount.sh
> >  /dev/sdb1 contains a ext4 file system
> >      last mounted on /mnt/ext4 on Mon May  8 17:05:38 2023
> >  [root@...ent01 sjc]# touch /mnt/ext4/file1
> >  [root@...ent01 sjc]# gcc test.c
> >  [root@...ent01 sjc]# time ./a.out
> >
> >  real    0m45.248s
> >  user    0m0.513s
> >  sys 0m39.231s
> >
> >  [root@...ent01 sjc]# ./mount.sh
> >  /dev/sdb1 contains a ext4 file system
> >      last mounted on /mnt/ext4 on Mon May  8 17:08:20 2023
> >  [root@...ent01 sjc]# touch /mnt/ext4/file1
> >  [root@...ent01 sjc]# time ./a.out
> >
> >  real    0m5.977s
> >  user    0m0.316s
> >  sys 0m5.659s
> >
> > The test.c and mount.sh are in [1].
> > This patch passed the tests with xfstests using 'check -g quick'.
> >
> > [1] https://gist.github.com/sjc2870/c923d7fa627d10ab65d6c305afb02cdb
> >
> > Signed-off-by: JunChao Sun <sunjunchao2870@...il.com>
> > ---
> >
> > Changes in v2:
> >  - Fix a problem when ref of an ea_inode not equal to 1
> >  - Link to v1: https://lore.kernel.org/linux-ext4/20230509011042.11781-1-sunjunchao2870@gmail.com/
> >
> > fs/ext4/xattr.c | 36 ++++++++++++++++++++++++++++++++++++
> > 1 file changed, 36 insertions(+)
> >
> > diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
> > index d57408cbe903..8f03958bfcc6 100644
> > --- a/fs/ext4/xattr.c
> > +++ b/fs/ext4/xattr.c
> > @@ -1713,6 +1713,42 @@ static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
> >               }
> >       }
> >
> > +     if (!s->not_found && i->value && here->e_value_inum && i->in_inode) {
> > +             /* Replace xattr value in ea_inode in place */
> > +             int size_diff = i->value_len - le32_to_cpu(here->e_value_size);
> > +
> > +             ret = ext4_xattr_inode_iget(inode,
> > +                                             le32_to_cpu(here->e_value_inum),
> > +                                             le32_to_cpu(here->e_hash),
> > +                                             &old_ea_inode);
> > +             if (ret) {
> > +                     old_ea_inode = NULL;
> > +                     goto out;
> > +             }
> > +             if (ext4_xattr_inode_get_ref(old_ea_inode) == 1) {
> > +                     if (size_diff > 0)
> > +                             ret = ext4_xattr_inode_alloc_quota(inode, size_diff);
> > +                     else if (size_diff < 0)
> > +                             ext4_xattr_inode_free_quota(inode, NULL, -size_diff);
> > +                     if (ret)
> > +                             goto out;
> > +
> > +                     ret = ext4_xattr_inode_write(handle, old_ea_inode, i->value, i->value_len);
> > +                     if (ret) {
> > +                             if (size_diff > 0)
> > +                                     ext4_xattr_inode_free_quota(inode, NULL, size_diff);
> > +                             else if (size_diff < 0)
> > +                                     ret = ext4_xattr_inode_alloc_quota(inode, -size_diff);
> > +                             goto out;
> > +                     }
> > +                     here->e_value_size = cpu_to_le32(i->value_len);
> > +                     new_ea_inode = old_ea_inode;
> > +                     old_ea_inode = NULL;
> > +                     goto update_hash;
> > +             } else
> > +                     iput(old_ea_inode);
> > +     }
> > +
> >       /*
> >        * Getting access to old and new ea inodes is subject to failures.
> >        * Finish that work before doing any modifications to the xattr data.
> > --
> > 1.8.3.1
> >
>
>
> Cheers, Andreas
>
>
>
>
>

Powered by blists - more mailing lists