lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAMmgxWE2ngnvQq-8qfQgS+ykx+9sSQ4kcLt4Ad0yEdKQyOUGrw@mail.gmail.com>
Date:   Tue, 12 Jul 2022 09:37:38 +0800
From:   bingjing chang <bxxxjxxg@...il.com>
To:     Filipe Manana <fdmanana@...nel.org>
Cc:     bingjingc <bingjingc@...ology.com>,
        Josef Bacik <josef@...icpanda.com>,
        David Sterba <dsterba@...e.com>, Chris Mason <clm@...com>,
        linux-btrfs <linux-btrfs@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Robbie Ko <robbieko@...ology.com>
Subject: Re: [PATCH 2/2] btrfs: send: fix a bug that sending a link command on
 existing file path

Filipe Manana <fdmanana@...nel.org> 於 2022年7月11日 週一 晚上10:10寫道:
>
> On Sun, Jul 10, 2022 at 11:07:10PM +0800, bingjing chang wrote:
> > Hi Filipe,
> >
> > Thank you for the review and comments. It really helps.
> > I will submit a patch v2 to fix the problems.
> >
> > Filipe Manana <fdmanana@...nel.org> 於 2022年7月7日 週四 凌晨12:04寫道:
> > >
> > > On Wed, Jul 06, 2022 at 09:09:03PM +0800, bingjingc wrote:
> > > > From: BingJing Chang <bingjingc@...ology.com>
> > > >
> > > > btrfs_ioctl_send processes recorded btrfs_keys in a defined order. (First,
> > > > we process a btrfs_key with a samller objectid. If btrfs_keys have the same
> > > > objectid, then we compare their types and offsets accordingly.)
> > >
> > > That's a very convoluted way to simply say that send processes keys in the order
> > > they are found in the btrees, from the leftmost to the rightmost.
> > >
> > > > However,
> > > > reference paths for an inode can be stored in either BTRFS_INODE_REF_KEY
> > > > btrfs_keys or BTRFS_INODE_EXTREF_KEY btrfs_keys. And due to the limitation
> > > > of the helper function - iterate_inode_ref, we can only iterate the entries
> > > > of ONE btrfs_inode_ref or btrfs_inode_extref. That is, there must be a bug
> > > > in processing the same reference paths, which are stored in different ways.
> > >
> > > When you say "there must be a bug", you want to say "there is a bug", and then
> > > explain what the bug is.
> > >
> > > However the bug has nothing to do with the order of how keys are processed.
> > >
> > > What happens is that when we are processing an inode we go over all references
> > > and add all the new names to the "new_refs" list and add all the deleted names
> > > to the "deleted_refs" list.
> > >
> > > Then in the end, when we finish processing the inode, we iterate over all the
> > > new names in the "new_refs" list and send link commands for those names. After
> > > that we go over all the names in the "deleted_refs" list and send unlink commands
> > > for those names.
> > >
> > > The problem in this case, is that we have names duplicated in both lists.
> > > So we try to create them before we remove them, therefore the receiver gets an
> > > -EEXIST error when trying the link operations.
> > >
> >
> > Yes, the problem happens when we have names duplicated in both lists.
> > And with the current logics in process_recorded_refs(), we will issue
> > links on existing files.
> >
> > I will try to make the description clear in patch v2.
> >
> > > >
> > > > Here is an exmple that btrfs_ioctl_send will send a link command on an
> > >
> > > exmple -> example
> > >
> > > > existing file path:
> > > >
> > > >   $ btrfs subvolume create test
> > > >
> > > >   # create a file and 2000 hard links to the same inode
> > > >   $ dd if=/dev/zero of=test/1M bs=1M count=1
> > >
> > > touch test/1M
> > >
> > > The data is completely irrelevant here, all we care about are the
> > > hard links of the file. Making the reproducer the minimum necessary
> > > to trigger a bug, makes it much less distracting and easier to grasp.
> > >
> > > >   $ for i in {1..2000}; do link test/1M test/$i ; done
> > > >
> > > >   # take a snapshot for parent snapshot
> > > >   $ btrfs sub snap -r test snap1
> > > >
> > > >   # remove 2000 hard links and re-create the last 1000 links
> > > >   $ for i in {1..2000}; do rm test/$i; done;
> > > >   $ for i in {1001..2000}; do link test/1M test/$i; done
> > > >
> > > >   # take another one for sned snapshot
> > >
> > > sned -> send
> > >
> > > >   $ btrfs sub snap -r test snap2
> > > >
> > > >   $ mkdir tmp
> > > >   $ btrfs send -e snap2 -p snap1 | btrfs receive tmp/
> > >
> > > The -e is not necessary.
> > >
> > > >   At subvol snap2
> > > >   link 1238 -> 1M
> > > >   ERROR: link 1238 -> 1M failed: File exists
> > > >
> > > > In the parent snapshot snap1, reference paths 1 - 1237 are stored in a
> > > > INODE_REF item and the rest ones are stored in other INODE_EXTREF items.
> > > > But in the send snapshot snap2, all reference paths can be stored within a
> > > > INODE_REF item. During the send process, btrfs_ioctl_send will process the
> > > > INODE_REF item first. Then it found that reference paths 1 - 1000 were
> > > > gone in the send snapshot, so it recorded them in sctx->deleted_refs for
> > > > unlink. And it found that reference paths 1238 - 2000 were new paths in
> > > > the send snapshot, so it recorded them in sctx->new_refs for link. Since
> > > > we do not load all contents of its INODE_EXTREF items to make comparison,
> > > > afterwards, btrfs_ioctl_send may make a mistake sending a link command on
> > > > an existing file path.
> > >
> > > So this explanation is not correct, it's neither about the names being stored
> > > in an INODE_REF or INODE_EXTREF item nor about processing one item before or
> > > after the other. As mentioned before, it's because we always process the
> > > "new_refs" list before the "deleted_refs" list, and in this case we have the
> > > same names (paths) in both lists.
> > >
> >
> > Sorry, I didn't make this paragraph easy to understand.
> > Here, I want to address that there is already a function find_iref(), which
> > can be used to check duplications. But it has limitations.
>
> Yes, I got it. Its problem is that it works only for the names inside a single
> ref (or extref) item.
>
> Your solution basically is a version of find_iref() that works across all the
> ref/extref items of an inode.
>

Yes, that is. Thank you for your understanding.

> >
> > For this example, we delete 2000 files and recreate 1000 files. Not all of
> > the 2000 file paths are added to the "deleted_refs" list. And not all of
> > the 1000 re-created file paths are added to the "new_refs" list. With
> > find_iref(), when processing inode references, {1001..1237} are not added
> > to any lists because they appear in both inode references. Afterwards,
> > two lists contain items as below:
> > "new_refs" list: {1238..2000}
> > "deleted_refs" list: {1..1000} + {1238..2000}
> > Therefore, there are duplicated items {1238..2000} in both lists. It results
> > in link failure.
>
> Yep. Many names got moved from an extref item into a regular ref item, which
> is what leads to the duplicate names in both lists.
>
> >
> > > >
> > > > To fix the bug, we can either remove the duplicated items both in
> > > > sctx->new_refs and sctx->deleted_refs before generating link/unlink
> > > > commands or prevent them from being added into list. Both of them require
> > > > efficient data structures like C++ sets to look up duplicated items.
> > >
> > > There's a much more obvious alternative, which is processing first the
> > > "deleted_refs" list before the "new_refs" list.
> > >
> > > It's inefficient because we do a bunch of unlinks followed by links for
> > > the same paths. On the other hand, it does not require maintaining two
> > > new rbtrees and allocating memory for records in these rbtrees.
> > >
> >
> > Thank you for mentioning this. I will describe the intuition idea and the
> > difficulties. I was unable to reshuffle the processing order. If someone can
> > do this, we are glad to pick that solution.
>
> I just tried that today, but it's harder than I though, I couldn't find a way
> for it to work without breaking other scenarios (mostly to get orphanization
> right and the need to recompute paths after orphanization).
>
> So I think we can go with your solution.
> I would just like to see two changes:
>
> 1) An updated changelog as mentioned before;
>
> 2) Change the code to only use the new infrastructure to manage refs.
>    The code you added is a parallel infrastructure that is more robust and
>    deals with that case of names being moved from an extref item to a ref
>    item. Having only one infrastructure makes things much easier to maintain
>    and a lot less code, and it also allows to rename record_ref2() to simply
>    record_ref(), as the old record_ref() is now gone.
>    I made those changes in the following patch, which you can apply on top
>    of your patch, or if you prefer I can send it separately.
>
> What do you think?
>
I just revised my patch. I'd like to ask you for a favor to submit a
separate one. Thanks a lot.

> The cleanup patch is the following, and it passes fstests and some light
> stress testing:
>
> From: Filipe Manana <fdmanana@...e.com>
> Date: Mon, 11 Jul 2022 14:55:26 +0100
> Subject: [PATCH] btrfs: send: always use the rbtree based inode ref management
>  infrastructure
>
> Signed-off-by: Filipe Manana <fdmanana@...e.com>
> ---
>  fs/btrfs/send.c | 203 +++++-------------------------------------------
>  1 file changed, 20 insertions(+), 183 deletions(-)
>
> diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
> index 37e3ba5c8586..797e2fb3b26b 100644
> --- a/fs/btrfs/send.c
> +++ b/fs/btrfs/send.c
> @@ -2189,7 +2189,7 @@ static int __get_cur_name_and_parent(struct send_ctx *sctx,
>         /*
>          * If the inode is not existent yet, add the orphan name and return 1.
>          * This should only happen for the parent dir that we determine in
> -        * __record_new_ref
> +        * __record_new_ref_if_needed().
>          */
>         ret = is_inode_existent(sctx, ino, gen);
>         if (ret < 0)
> @@ -2790,27 +2790,6 @@ static void set_ref_path(struct recorded_ref *ref, struct fs_path *path)
>         ref->name_len = ref->full_path->end - ref->name;
>  }
>
> -/*
> - * We need to process new refs before deleted refs, but compare_tree gives us
> - * everything mixed. So we first record all refs and later process them.
> - * This function is a helper to record one ref.
> - */
> -static int __record_ref(struct list_head *head, u64 dir,
> -                     u64 dir_gen, struct fs_path *path)
> -{
> -       struct recorded_ref *ref;
> -
> -       ref = recorded_ref_alloc();
> -       if (!ref)
> -               return -ENOMEM;
> -
> -       ref->dir = dir;
> -       ref->dir_gen = dir_gen;
> -       set_ref_path(ref, path);
> -       list_add_tail(&ref->list, head);
> -       return 0;
> -}
> -
>  static int dup_ref(struct recorded_ref *ref, struct list_head *list)
>  {
>         struct recorded_ref *new;
> @@ -4337,56 +4316,6 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move)
>         return ret;
>  }
>
> -static int record_ref(struct btrfs_root *root, u64 dir, struct fs_path *name,
> -                     void *ctx, struct list_head *refs)
> -{
> -       int ret = 0;
> -       struct send_ctx *sctx = ctx;
> -       struct fs_path *p;
> -       u64 gen;
> -
> -       p = fs_path_alloc();
> -       if (!p)
> -               return -ENOMEM;
> -
> -       ret = get_inode_info(root, dir, NULL, &gen, NULL, NULL,
> -                       NULL, NULL);
> -       if (ret < 0)
> -               goto out;
> -
> -       ret = get_cur_path(sctx, dir, gen, p);
> -       if (ret < 0)
> -               goto out;
> -       ret = fs_path_add_path(p, name);
> -       if (ret < 0)
> -               goto out;
> -
> -       ret = __record_ref(refs, dir, gen, p);
> -
> -out:
> -       if (ret)
> -               fs_path_free(p);
> -       return ret;
> -}
> -
> -static int __record_new_ref(int num, u64 dir, int index,
> -                           struct fs_path *name,
> -                           void *ctx)
> -{
> -       struct send_ctx *sctx = ctx;
> -       return record_ref(sctx->send_root, dir, name, ctx, &sctx->new_refs);
> -}
> -
> -
> -static int __record_deleted_ref(int num, u64 dir, int index,
> -                               struct fs_path *name,
> -                               void *ctx)
> -{
> -       struct send_ctx *sctx = ctx;
> -       return record_ref(sctx->parent_root, dir, name, ctx,
> -                         &sctx->deleted_refs);
> -}
> -
>  static int rbtree_ref_comp(const void *k, const struct rb_node *node)
>  {
>         const struct recorded_ref *data = k;
> @@ -4422,9 +4351,9 @@ static bool rbtree_ref_less(struct rb_node *node, const struct rb_node *parent)
>         return rbtree_ref_comp(entry, parent) < 0;
>  }
>
> -static int record_ref2(struct rb_root *root, struct list_head *refs,
> -                            struct fs_path *name, u64 dir, u64 dir_gen,
> -                            struct send_ctx *sctx)
> +static int record_ref(struct rb_root *root, struct list_head *refs,
> +                     struct fs_path *name, u64 dir, u64 dir_gen,
> +                     struct send_ctx *sctx)
>  {
>         int ret = 0;
>         struct fs_path *path = NULL;
> @@ -4487,8 +4416,8 @@ static int __record_new_ref_if_needed(int num, u64 dir, int index,
>                 ref = rb_entry(node, struct recorded_ref, node);
>                 recorded_ref_free(ref);
>         } else {
> -               ret = record_ref2(&sctx->rbtree_new_refs, &sctx->new_refs,
> -                                 name, dir, dir_gen, sctx);
> +               ret = record_ref(&sctx->rbtree_new_refs, &sctx->new_refs,
> +                                name, dir, dir_gen, sctx);
>         }
>  out:
>         return ret;
> @@ -4518,9 +4447,9 @@ static int __record_deleted_ref_if_needed(int num, u64 dir, int index,
>                 ref = rb_entry(node, struct recorded_ref, node);
>                 recorded_ref_free(ref);
>         } else {
> -               ret = record_ref2(&sctx->rbtree_deleted_refs,
> -                                 &sctx->deleted_refs, name, dir, dir_gen,
> -                                 sctx);
> +               ret = record_ref(&sctx->rbtree_deleted_refs,
> +                                &sctx->deleted_refs, name, dir, dir_gen,
> +                                sctx);
>         }
>  out:
>         return ret;
> @@ -4564,113 +4493,16 @@ struct find_ref_ctx {
>         int found_idx;
>  };
>
> -static int __find_iref(int num, u64 dir, int index,
> -                      struct fs_path *name,
> -                      void *ctx_)
> -{
> -       struct find_ref_ctx *ctx = ctx_;
> -       u64 dir_gen;
> -       int ret;
> -
> -       if (dir == ctx->dir && fs_path_len(name) == fs_path_len(ctx->name) &&
> -           strncmp(name->start, ctx->name->start, fs_path_len(name)) == 0) {
> -               /*
> -                * To avoid doing extra lookups we'll only do this if everything
> -                * else matches.
> -                */
> -               ret = get_inode_info(ctx->root, dir, NULL, &dir_gen, NULL,
> -                                    NULL, NULL, NULL);
> -               if (ret)
> -                       return ret;
> -               if (dir_gen != ctx->dir_gen)
> -                       return 0;
> -               ctx->found_idx = num;
> -               return 1;
> -       }
> -       return 0;
> -}
> -
> -static int find_iref(struct btrfs_root *root,
> -                    struct btrfs_path *path,
> -                    struct btrfs_key *key,
> -                    u64 dir, u64 dir_gen, struct fs_path *name)
> -{
> -       int ret;
> -       struct find_ref_ctx ctx;
> -
> -       ctx.dir = dir;
> -       ctx.name = name;
> -       ctx.dir_gen = dir_gen;
> -       ctx.found_idx = -1;
> -       ctx.root = root;
> -
> -       ret = iterate_inode_ref(root, path, key, 0, __find_iref, &ctx);
> -       if (ret < 0)
> -               return ret;
> -
> -       if (ctx.found_idx == -1)
> -               return -ENOENT;
> -
> -       return ctx.found_idx;
> -}
> -
> -static int __record_changed_new_ref(int num, u64 dir, int index,
> -                                   struct fs_path *name,
> -                                   void *ctx)
> -{
> -       u64 dir_gen;
> -       int ret;
> -       struct send_ctx *sctx = ctx;
> -
> -       ret = get_inode_info(sctx->send_root, dir, NULL, &dir_gen, NULL,
> -                            NULL, NULL, NULL);
> -       if (ret)
> -               return ret;
> -
> -       ret = find_iref(sctx->parent_root, sctx->right_path,
> -                       sctx->cmp_key, dir, dir_gen, name);
> -       if (ret == -ENOENT)
> -               ret = __record_new_ref_if_needed(num, dir, index, name, sctx);
> -       else if (ret > 0)
> -               ret = 0;
> -
> -       return ret;
> -}
> -
> -static int __record_changed_deleted_ref(int num, u64 dir, int index,
> -                                       struct fs_path *name,
> -                                       void *ctx)
> -{
> -       u64 dir_gen;
> -       int ret;
> -       struct send_ctx *sctx = ctx;
> -
> -       ret = get_inode_info(sctx->parent_root, dir, NULL, &dir_gen, NULL,
> -                            NULL, NULL, NULL);
> -       if (ret)
> -               return ret;
> -
> -       ret = find_iref(sctx->send_root, sctx->left_path, sctx->cmp_key,
> -                       dir, dir_gen, name);
> -       if (ret == -ENOENT)
> -               ret = __record_deleted_ref_if_needed(num, dir, index, name,
> -                                                    sctx);
> -       else if (ret > 0)
> -               ret = 0;
> -
> -       return ret;
> -}
> -
>  static int record_changed_ref(struct send_ctx *sctx)
>  {
>         int ret = 0;
>
>         ret = iterate_inode_ref(sctx->send_root, sctx->left_path,
> -                       sctx->cmp_key, 0, __record_changed_new_ref, sctx);
> +                       sctx->cmp_key, 0, __record_new_ref_if_needed, sctx);
>         if (ret < 0)
>                 goto out;
>         ret = iterate_inode_ref(sctx->parent_root, sctx->right_path,
> -                       sctx->cmp_key, 0, __record_changed_deleted_ref, sctx);
> +                       sctx->cmp_key, 0, __record_deleted_ref_if_needed, sctx);
>         if (ret < 0)
>                 goto out;
>         ret = 0;
> @@ -4701,10 +4533,10 @@ static int process_all_refs(struct send_ctx *sctx,
>
>         if (cmd == BTRFS_COMPARE_TREE_NEW) {
>                 root = sctx->send_root;
> -               cb = __record_new_ref;
> +               cb = __record_new_ref_if_needed;
>         } else if (cmd == BTRFS_COMPARE_TREE_DELETED) {
>                 root = sctx->parent_root;
> -               cb = __record_deleted_ref;
> +               cb = __record_deleted_ref_if_needed;
>         } else {
>                 btrfs_err(sctx->send_root->fs_info,
>                                 "Wrong command %d in process_all_refs", cmd);
> @@ -6544,8 +6376,13 @@ static int record_parent_ref(int num, u64 dir, int index, struct fs_path *name,
>  {
>         struct parent_paths_ctx *ppctx = ctx;
>
> -       return record_ref(ppctx->sctx->parent_root, dir, name, ppctx->sctx,
> -                         ppctx->refs);
> +       /*
> +        * Pass 0 as the generation for the directory, we don't care about it
> +        * here as we have no new references to add, we just want to delete all
> +        * references for an inode.
> +        */
> +       return record_ref(&ppctx->sctx->rbtree_deleted_refs, ppctx->refs, name,
> +                         dir, 0, ppctx->sctx);
>  }
>
>  /*
> --
> 2.35.1
>
>
> Thanks.
>
>
> >
> > > Plus this type of scenario should be very rare. It requires having at least
> > > hundreds of hard links in an inode in order to trigger the creation of extended
> > > references, and then removing and recreating a bunch of those hard links in the
> > > send snapshot. How common is that?
> > >
> > > Is this a case you got into in some user workload, or was it found by reading
> > > and inspecting the code?
> > >
> >
> > It's not common. But it happened in the real world. Some version backup
> > applications seem to use large hard links for some purposes.
> >
> > > I would like to have in the changelog this alternative mentioned, and if it's
> > > not good, then explain why it is not, and why we must follow a different solution.
> > >
> >
> > > It's probably not easy, because at process_recorded_refs() when we unlink we
> > > need to know if any ancestor was orphanized before, which we figure out when
> > > we iterate over the "new_refs" list. So it would likely need some reshuffling
> > > in the logic of how we do things there.
> > >
> > > > And
> > > > we also need to take two scenarios into consideration. One is the most
> > > > common case that one inode has only one reference path. The other is the
> > > > worst case that there are ten thousands of hard links of an inode.
> > > > (BTRFS_LINK_MAX is 65536) So we'd like to introduce rbtree to store the
> > > > computing references. (The tree depth of the worst cases is just 16. And
> > >
> > > computing -> computed
> > >
> > > > it takes less memory to store few entries than hash sets.) And in order
> > > > not to occupy too much moemory, we also introduce
> > >
> > > moemory -> memory
> > >
> > > > __record_new_ref_if_needed() and __record_deleted_ref_if_needed() for
> > > > changed_ref() to check and remove the duplications early.
> > >
> > > Also, the subject:
> > >
> > > "btrfs: send: fix a bug that sending a link command on existing file path"
> > >
> > > sounds a bit awkward, mostly because of the "that".
> > > Something like the following would be more concise:
> > >
> > > "btrfs: send: fix sending link commands for existing file paths"
> > >
> > > >
> > > > Reviewed-by: Robbie Ko <robbieko@...ology.com>
> > > > Signed-off-by: BingJing Chang <bingjingc@...ology.com>
> > > > ---
> > > >  fs/btrfs/send.c | 160 ++++++++++++++++++++++++++++++++++++++++++++++--
> > > >  1 file changed, 156 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
> > > > index 420a86720aa2..23ae631ef23b 100644
> > > > --- a/fs/btrfs/send.c
> > > > +++ b/fs/btrfs/send.c
> > > > @@ -234,6 +234,9 @@ struct send_ctx {
> > > >        * Indexed by the inode number of the directory to be deleted.
> > > >        */
> > > >       struct rb_root orphan_dirs;
> > > > +
> > > > +     struct rb_root rbtree_new_refs;
> > > > +     struct rb_root rbtree_deleted_refs;
> > > >  };
> > > >
> > > >  struct pending_dir_move {
> > > > @@ -2747,6 +2750,8 @@ struct recorded_ref {
> > > >       u64 dir;
> > > >       u64 dir_gen;
> > > >       int name_len;
> > > > +     struct rb_node node;
> > > > +     struct rb_root *root;
> > > >  };
> > > >
> > > >  static struct recorded_ref *recorded_ref_alloc(void)
> > > > @@ -2756,6 +2761,7 @@ static struct recorded_ref *recorded_ref_alloc(void)
> > > >       ref = kzalloc(sizeof(*ref), GFP_KERNEL);
> > > >       if (!ref)
> > > >               return NULL;
> > > > +     RB_CLEAR_NODE(&ref->node);
> > > >       INIT_LIST_HEAD(&ref->list);
> > > >       return ref;
> > > >  }
> > > > @@ -2764,6 +2770,8 @@ static void recorded_ref_free(struct recorded_ref *ref)
> > > >  {
> > > >       if (!ref)
> > > >               return;
> > > > +     if (!RB_EMPTY_NODE(&ref->node))
> > > > +             rb_erase(&ref->node, ref->root);
> > > >       list_del(&ref->list);
> > > >       fs_path_free(ref->full_path);
> > > >       kfree(ref);
> > > > @@ -4373,12 +4381,152 @@ static int __record_deleted_ref(int num, u64 dir, int index,
> > > >                         &sctx->deleted_refs);
> > > >  }
> > > >
> > > > +static int rbtree_ref_comp(const void *k, const struct rb_node *node)
> > > > +{
> > > > +     const struct recorded_ref *data = k;
> > > > +     const struct recorded_ref *ref = rb_entry(node, struct recorded_ref, node);
> > > > +     int result;
> > > > +
> > > > +     if (data->dir > ref->dir)
> > > > +             return 1;
> > > > +     if (data->dir < ref->dir)
> > > > +             return -1;
> > > > +     if (data->dir_gen > ref->dir_gen)
> > > > +             return 1;
> > > > +     if (data->dir_gen < ref->dir_gen)
> > > > +             return -1;
> > > > +     if (data->name_len > ref->name_len)
> > > > +             return 1;
> > > > +     if (data->name_len < ref->name_len)
> > > > +             return -1;
> > > > +     result = strcmp(data->name, ref->name);
> > > > +     if (result > 0)
> > > > +             return 1;
> > > > +     if (result < 0)
> > > > +             return -1;
> > > > +     return 0;
> > > > +}
> > > > +
> > > > +static bool rbtree_ref_less(struct rb_node *node, const struct rb_node *parent)
> > > > +{
> > > > +     const struct recorded_ref *entry = rb_entry(node,
> > > > +                                                 struct recorded_ref,
> > > > +                                                 node);
> > > > +
> > > > +     return rbtree_ref_comp(entry, parent) < 0;
> > > > +}
> > > > +
> > > > +static int record_ref2(struct rb_root *root, struct list_head *refs,
> > >
> > > This is a terrible function name.
> > >
> > > If we end up with this solution, please rename it to something more clear
> > > like record_ref_in_tree() for example. Almost anything other than an existing
> > > function name followed by a number is a better choice.
> > >
> >
> > Thank you for the naming.
> >
> > > This bug is a very good finding, and has been around since forever.
> > >
> > > Thanks.
> > >
> > > > +                          struct fs_path *name, u64 dir, u64 dir_gen,
> > > > +                          struct send_ctx *sctx)
> > > > +{
> > > > +     int ret = 0;
> > > > +     struct fs_path *path = NULL;
> > > > +     struct recorded_ref *ref = NULL;
> > > > +
> > > > +     path = fs_path_alloc();
> > > > +     if (!path) {
> > > > +             ret = -ENOMEM;
> > > > +             goto out;
> > > > +     }
> > > > +
> > > > +     ref = recorded_ref_alloc();
> > > > +     if (!ref) {
> > > > +             ret = -ENOMEM;
> > > > +             goto out;
> > > > +     }
> > > > +
> > > > +     ret = get_cur_path(sctx, dir, dir_gen, path);
> > > > +     if (ret < 0)
> > > > +             goto out;
> > > > +     ret = fs_path_add_path(path, name);
> > > > +     if (ret < 0)
> > > > +             goto out;
> > > > +
> > > > +     ref->dir = dir;
> > > > +     ref->dir_gen = dir_gen;
> > > > +     set_ref_path(ref, path);
> > > > +     list_add_tail(&ref->list, refs);
> > > > +     rb_add(&ref->node, root, rbtree_ref_less);
> > > > +     ref->root = root;
> > > > +out:
> > > > +     if (ret) {
> > > > +             if (path && (!ref || !ref->full_path))
> > > > +                     fs_path_free(path);
> > > > +             recorded_ref_free(ref);
> > > > +     }
> > > > +     return ret;
> > > > +}
> > > > +
> > > > +static int __record_new_ref_if_needed(int num, u64 dir, int index,
> > > > +                                   struct fs_path *name, void *ctx)
> > > > +{
> > > > +     int ret = 0;
> > > > +     struct send_ctx *sctx = ctx;
> > > > +     struct rb_node *node = NULL;
> > > > +     struct recorded_ref data;
> > > > +     struct recorded_ref *ref;
> > > > +     u64 dir_gen;
> > > > +
> > > > +     ret = get_inode_info(sctx->send_root, dir, NULL, &dir_gen, NULL,
> > > > +                          NULL, NULL, NULL);
> > > > +     if (ret < 0)
> > > > +             goto out;
> > > > +
> > > > +     data.dir = dir;
> > > > +     data.dir_gen = dir_gen;
> > > > +     set_ref_path(&data, name);
> > > > +     node = rb_find(&data, &sctx->rbtree_deleted_refs, rbtree_ref_comp);
> > > > +     if (node) {
> > > > +             ref = rb_entry(node, struct recorded_ref, node);
> > > > +             recorded_ref_free(ref);
> > > > +     } else {
> > > > +             ret = record_ref2(&sctx->rbtree_new_refs, &sctx->new_refs,
> > > > +                               name, dir, dir_gen, sctx);
> > > > +     }
> > > > +out:
> > > > +     return ret;
> > > > +}
> > > > +
> > > > +static int __record_deleted_ref_if_needed(int num, u64 dir, int index,
> > > > +                         struct fs_path *name,
> > > > +                         void *ctx)
> > > > +{
> > > > +     int ret = 0;
> > > > +     struct send_ctx *sctx = ctx;
> > > > +     struct rb_node *node = NULL;
> > > > +     struct recorded_ref data;
> > > > +     struct recorded_ref *ref;
> > > > +     u64 dir_gen;
> > > > +
> > > > +     ret = get_inode_info(sctx->parent_root, dir, NULL, &dir_gen, NULL,
> > > > +                          NULL, NULL, NULL);
> > > > +     if (ret < 0)
> > > > +             goto out;
> > > > +
> > > > +     data.dir = dir;
> > > > +     data.dir_gen = dir_gen;
> > > > +     set_ref_path(&data, name);
> > > > +     node = rb_find(&data, &sctx->rbtree_new_refs, rbtree_ref_comp);
> > > > +     if (node) {
> > > > +             ref = rb_entry(node, struct recorded_ref, node);
> > > > +             recorded_ref_free(ref);
> > > > +     } else {
> > > > +             ret = record_ref2(&sctx->rbtree_deleted_refs,
> > > > +                               &sctx->deleted_refs, name, dir, dir_gen,
> > > > +                               sctx);
> > > > +     }
> > > > +out:
> > > > +     return ret;
> > > > +}
> > > > +
> > > >  static int record_new_ref(struct send_ctx *sctx)
> > > >  {
> > > >       int ret;
> > > >
> > > >       ret = iterate_inode_ref(sctx->send_root, sctx->left_path,
> > > > -                             sctx->cmp_key, 0, __record_new_ref, sctx);
> > > > +                             sctx->cmp_key, 0, __record_new_ref_if_needed,
> > > > +                             sctx);
> > > >       if (ret < 0)
> > > >               goto out;
> > > >       ret = 0;
> > > > @@ -4392,7 +4540,8 @@ static int record_deleted_ref(struct send_ctx *sctx)
> > > >       int ret;
> > > >
> > > >       ret = iterate_inode_ref(sctx->parent_root, sctx->right_path,
> > > > -                             sctx->cmp_key, 0, __record_deleted_ref, sctx);
> > > > +                             sctx->cmp_key, 0,
> > > > +                             __record_deleted_ref_if_needed, sctx);
> > > >       if (ret < 0)
> > > >               goto out;
> > > >       ret = 0;
> > > > @@ -4475,7 +4624,7 @@ static int __record_changed_new_ref(int num, u64 dir, int index,
> > > >       ret = find_iref(sctx->parent_root, sctx->right_path,
> > > >                       sctx->cmp_key, dir, dir_gen, name);
> > > >       if (ret == -ENOENT)
> > > > -             ret = __record_new_ref(num, dir, index, name, sctx);
> > > > +             ret = __record_new_ref_if_needed(num, dir, index, name, sctx);
> > > >       else if (ret > 0)
> > > >               ret = 0;
> > > >
> > > > @@ -4498,7 +4647,8 @@ static int __record_changed_deleted_ref(int num, u64 dir, int index,
> > > >       ret = find_iref(sctx->send_root, sctx->left_path, sctx->cmp_key,
> > > >                       dir, dir_gen, name);
> > > >       if (ret == -ENOENT)
> > > > -             ret = __record_deleted_ref(num, dir, index, name, sctx);
> > > > +             ret = __record_deleted_ref_if_needed(num, dir, index, name,
> > > > +                                                  sctx);
> > > >       else if (ret > 0)
> > > >               ret = 0;
> > > >
> > > > @@ -7576,6 +7726,8 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg)
> > > >       sctx->pending_dir_moves = RB_ROOT;
> > > >       sctx->waiting_dir_moves = RB_ROOT;
> > > >       sctx->orphan_dirs = RB_ROOT;
> > > > +     sctx->rbtree_new_refs = RB_ROOT;
> > > > +     sctx->rbtree_deleted_refs = RB_ROOT;
> > > >
> > > >       sctx->clone_roots = kvcalloc(sizeof(*sctx->clone_roots),
> > > >                                    arg->clone_sources_count + 1,
> > > > --
> > > > 2.37.0
> > > >

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ