Message-ID: <CAL3q7H7V0XQVvOiX8VwyZDzdM6qJeSbcq32sVYducLnNc4+ZWA@mail.gmail.com>
Date: Wed, 1 Oct 2025 22:50:59 +0100
From: Filipe Manana <fdmanana@...nel.org>
To: Miquel Sabaté Solà <mssola@...ola.com>
Cc: linux-btrfs@...r.kernel.org, clm@...com, dsterba@...e.com, 
	fdmanana@...e.com, wqu@...e.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] btrfs: prevent a double kfree on delayed-ref

On Wed, Oct 1, 2025 at 7:05 PM Miquel Sabaté Solà <mssola@...ola.com> wrote:
>
> In the previous code it was possible to trigger a double kfree() when
> calling add_delayed_ref_head(). This could happen if the record was
> reported to already exist in the btrfs_qgroup_trace_extent_nolock()
> call, but an error occurred later in add_delayed_ref_head(). In that
> case, since add_delayed_ref_head() returned an error, the caller went
> on to free the record. Because add_delayed_ref_head() could not set
> the already freed pointer to NULL, the caller's kfree() acted on a
> non-NULL 'record' pointer whose memory had already been freed by the
> callee.
>
> The problem comes from the fact that responsibility for freeing the
> object lies with both the caller and the callee at the same time.
> Hence, the fix is to move ownership of the 'qrecord' object out of
> add_delayed_ref_head(): the function never attempts to kfree() the
> given object, and the caller is expected to act on the 'qrecord'
> object on its own. The only case where the 'qrecord' object must not
> be kfree'd is when it was inserted into the tracing logic, which the
> 'qrecord_inserted_ret' boolean already accounts for. Hence, the caller
> has to kfree the object only if add_delayed_ref_head() reports that it
> did not insert it into the tracing logic.
>
> As a side effect of the above, 'qrecord_inserted_ret' must be
> initialized at the start of the function, not at the end, and then set
> only when an actual insert happens. This way 'qrecord_inserted_ret'
> never holds an invalid value on an early exit.
>
> The documentation of add_delayed_ref_head() has also been updated to
> reflect the exact ownership of the 'qrecord' object.
>
> Fixes: 6ef8fbce0104 ("btrfs: fix missing error handling when adding delayed ref with qgroups enabled")
> Signed-off-by: Miquel Sabaté Solà <mssola@...ola.com>

Reviewed-by: Filipe Manana <fdmanana@...e.com>

Thanks. I've pushed this to for-next [1] and renamed the subject to be
more specific:

"btrfs: fix double free of qgroup record after failure to add delayed ref head"

[1] https://github.com/btrfs/linux/commits/for-next/

> ---
>
> Changes in v2:
>
> - Change documentation to reflect that we are using an xarray instead of an
>   rbtree.
> - Change warning to assertion on qrecord_insert_ret.
> - Fix function notation on comments.
> - Fix commit message as suggested.
>
>  fs/btrfs/delayed-ref.c | 43 ++++++++++++++++++++++++++++++++----------
>  1 file changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
> index 481802efaa14..f8fc26272f76 100644
> --- a/fs/btrfs/delayed-ref.c
> +++ b/fs/btrfs/delayed-ref.c
> @@ -798,9 +798,13 @@ static void init_delayed_ref_head(struct btrfs_delayed_ref_head *head_ref,
>  }
>
>  /*
> - * helper function to actually insert a head node into the rbtree.
> - * this does all the dirty work in terms of maintaining the correct
> - * overall modification count.
> + * Helper function to actually insert a head node into the xarray. This does all
> + * the dirty work in terms of maintaining the correct overall modification
> + * count.
> + *
> + * The caller is responsible for calling kfree() on @qrecord. More specifically,
> + * if this function reports that it did not insert it as noted in
> + * @qrecord_inserted_ret, then it's safe to call kfree() on it.
>   *
>   * Returns an error pointer in case of an error.
>   */
> @@ -814,7 +818,14 @@ add_delayed_ref_head(struct btrfs_trans_handle *trans,
>         struct btrfs_delayed_ref_head *existing;
>         struct btrfs_delayed_ref_root *delayed_refs;
>         const unsigned long index = (head_ref->bytenr >> fs_info->sectorsize_bits);
> -       bool qrecord_inserted = false;
> +
> +       /*
> +        * If 'qrecord_inserted_ret' is provided, then the first thing we need
> +        * to do is to initialize it to false just in case we have an exit
> +        * before trying to insert the record.
> +        */
> +       if (qrecord_inserted_ret)
> +               *qrecord_inserted_ret = false;
>
>         delayed_refs = &trans->transaction->delayed_refs;
>         lockdep_assert_held(&delayed_refs->lock);
> @@ -833,6 +844,12 @@ add_delayed_ref_head(struct btrfs_trans_handle *trans,
>
>         /* Record qgroup extent info if provided */
>         if (qrecord) {
> +               /*
> +                * Setting 'qrecord' but not 'qrecord_inserted_ret' will likely
> +                * result in a memory leakage.
> +                */
> +               ASSERT(qrecord_inserted_ret != NULL);
> +
>                 int ret;
>
>                 ret = btrfs_qgroup_trace_extent_nolock(fs_info, delayed_refs, qrecord,
> @@ -840,12 +857,10 @@ add_delayed_ref_head(struct btrfs_trans_handle *trans,
>                 if (ret) {
>                         /* Clean up if insertion fails or item exists. */
>                         xa_release(&delayed_refs->dirty_extents, index);
> -                       /* Caller responsible for freeing qrecord on error. */
>                         if (ret < 0)
>                                 return ERR_PTR(ret);
> -                       kfree(qrecord);
> -               } else {
> -                       qrecord_inserted = true;
> +               } else if (qrecord_inserted_ret) {
> +                       *qrecord_inserted_ret = true;
>                 }
>         }
>
> @@ -888,8 +903,6 @@ add_delayed_ref_head(struct btrfs_trans_handle *trans,
>                 delayed_refs->num_heads++;
>                 delayed_refs->num_heads_ready++;
>         }
> -       if (qrecord_inserted_ret)
> -               *qrecord_inserted_ret = qrecord_inserted;
>
>         return head_ref;
>  }
> @@ -1049,6 +1062,14 @@ static int add_delayed_ref(struct btrfs_trans_handle *trans,
>                 xa_release(&delayed_refs->head_refs, index);
>                 spin_unlock(&delayed_refs->lock);
>                 ret = PTR_ERR(new_head_ref);
> +
> +               /*
> +                * It's only safe to call kfree() on 'qrecord' if
> +                * add_delayed_ref_head() has _not_ inserted it for
> +                * tracing. Otherwise we need to handle this here.
> +                */
> +               if (!qrecord_reserved || qrecord_inserted)
> +                       goto free_head_ref;
>                 goto free_record;
>         }
>         head_ref = new_head_ref;
> @@ -1071,6 +1092,8 @@ static int add_delayed_ref(struct btrfs_trans_handle *trans,
>
>         if (qrecord_inserted)
>                 return btrfs_qgroup_trace_extent_post(trans, record, generic_ref->bytenr);
> +
> +       kfree(record);
>         return 0;
>
>  free_record:
> --
> 2.51.0
>
