[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAL3q7H7V0XQVvOiX8VwyZDzdM6qJeSbcq32sVYducLnNc4+ZWA@mail.gmail.com>
Date: Wed, 1 Oct 2025 22:50:59 +0100
From: Filipe Manana <fdmanana@...nel.org>
To: Miquel Sabaté Solà <mssola@...ola.com>
Cc: linux-btrfs@...r.kernel.org, clm@...com, dsterba@...e.com,
fdmanana@...e.com, wqu@...e.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] btrfs: prevent a double kfree on delayed-ref
On Wed, Oct 1, 2025 at 7:05 PM Miquel Sabaté Solà <mssola@...ola.com> wrote:
>
> In the previous code it was possible to incur into a double kfree()
> scenario when calling add_delayed_ref_head(). This could happen if the
> record was reported to already exist in the
> btrfs_qgroup_trace_extent_nolock() call, but then there was an error
> later on add_delayed_ref_head(). In this case, since
> add_delayed_ref_head() returned an error, the caller went to free the
> record. Since add_delayed_ref_head() couldn't set this kfree'd pointer
> to NULL, then kfree() would have acted on a non-NULL 'record' object
> which was pointing to memory already freed by the callee.
>
> The problem comes from the fact that the responsibility to kfree the
> object is on both the caller and the callee at the same time. Hence, the
> fix for this is to shift the ownership of the 'qrecord' object out of
> the add_delayed_ref_head(). That is, we will never attempt to kfree()
> the given object inside of this function, and will expect the caller to
> act on the 'qrecord' object on its own. The only exception where the
> 'qrecord' object cannot be kfree'd is if it was inserted into the
> tracing logic, for which we already have the 'qrecord_inserted_ret'
> boolean to account for this. Hence, the caller has to kfree the object
> only if add_delayed_ref_head() reports not to have inserted it on the
> tracing logic.
>
> As a side-effect of the above, we must guarantee that
> 'qrecord_inserted_ret' is properly initialized at the start of the
> function, not at the end, and then set when an actual insert
> happens. This way we avoid 'qrecord_inserted_ret' having an invalid
> value on an early exit.
>
> The documentation from the add_delayed_ref_head() has also been updated
> to reflect on the exact ownership of the 'qrecord' object.
>
> Fixes: 6ef8fbce0104 ("btrfs: fix missing error handling when adding delayed ref with qgroups enabled")
> Signed-off-by: Miquel Sabaté Solà <mssola@...ola.com>
Reviewed-by: Filipe Manana <fdmanana@...e.com>
Thanks. I've pushed this to for-next [1] and renamed the subject to be
more specific:
"btrfs: fix double free of qgroup record after failure to add delayed ref head"
[1] https://github.com/btrfs/linux/commits/for-next/
> ---
>
> Changes in v2:
>
> - Change documentation to reflect that we are using an xarray instead of an
> rbtree.
> - Change warning to assertion on qrecord_insert_ret.
> - Fix function notation on comments.
> - Fix commit message as suggested
>
> fs/btrfs/delayed-ref.c | 43 ++++++++++++++++++++++++++++++++----------
> 1 file changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
> index 481802efaa14..f8fc26272f76 100644
> --- a/fs/btrfs/delayed-ref.c
> +++ b/fs/btrfs/delayed-ref.c
> @@ -798,9 +798,13 @@ static void init_delayed_ref_head(struct btrfs_delayed_ref_head *head_ref,
> }
>
> /*
> - * helper function to actually insert a head node into the rbtree.
> - * this does all the dirty work in terms of maintaining the correct
> - * overall modification count.
> + * Helper function to actually insert a head node into the xarray. This does all
> + * the dirty work in terms of maintaining the correct overall modification
> + * count.
> + *
> + * The caller is responsible for calling kfree() on @qrecord. More specifically,
> + * if this function reports that it did not insert it as noted in
> + * @qrecord_inserted_ret, then it's safe to call kfree() on it.
> *
> * Returns an error pointer in case of an error.
> */
> @@ -814,7 +818,14 @@ add_delayed_ref_head(struct btrfs_trans_handle *trans,
> struct btrfs_delayed_ref_head *existing;
> struct btrfs_delayed_ref_root *delayed_refs;
> const unsigned long index = (head_ref->bytenr >> fs_info->sectorsize_bits);
> - bool qrecord_inserted = false;
> +
> + /*
> + * If 'qrecord_inserted_ret' is provided, then the first thing we need
> + * to do is to initialize it to false just in case we have an exit
> + * before trying to insert the record.
> + */
> + if (qrecord_inserted_ret)
> + *qrecord_inserted_ret = false;
>
> delayed_refs = &trans->transaction->delayed_refs;
> lockdep_assert_held(&delayed_refs->lock);
> @@ -833,6 +844,12 @@ add_delayed_ref_head(struct btrfs_trans_handle *trans,
>
> /* Record qgroup extent info if provided */
> if (qrecord) {
> + /*
> + * Setting 'qrecord' but not 'qrecord_inserted_ret' will likely
> + * result in a memory leakage.
> + */
> + ASSERT(qrecord_inserted_ret != NULL);
> +
> int ret;
>
> ret = btrfs_qgroup_trace_extent_nolock(fs_info, delayed_refs, qrecord,
> @@ -840,12 +857,10 @@ add_delayed_ref_head(struct btrfs_trans_handle *trans,
> if (ret) {
> /* Clean up if insertion fails or item exists. */
> xa_release(&delayed_refs->dirty_extents, index);
> - /* Caller responsible for freeing qrecord on error. */
> if (ret < 0)
> return ERR_PTR(ret);
> - kfree(qrecord);
> - } else {
> - qrecord_inserted = true;
> + } else if (qrecord_inserted_ret) {
> + *qrecord_inserted_ret = true;
> }
> }
>
> @@ -888,8 +903,6 @@ add_delayed_ref_head(struct btrfs_trans_handle *trans,
> delayed_refs->num_heads++;
> delayed_refs->num_heads_ready++;
> }
> - if (qrecord_inserted_ret)
> - *qrecord_inserted_ret = qrecord_inserted;
>
> return head_ref;
> }
> @@ -1049,6 +1062,14 @@ static int add_delayed_ref(struct btrfs_trans_handle *trans,
> xa_release(&delayed_refs->head_refs, index);
> spin_unlock(&delayed_refs->lock);
> ret = PTR_ERR(new_head_ref);
> +
> + /*
> + * It's only safe to call kfree() on 'qrecord' if
> + * add_delayed_ref_head() has _not_ inserted it for
> + * tracing. Otherwise we need to handle this here.
> + */
> + if (!qrecord_reserved || qrecord_inserted)
> + goto free_head_ref;
> goto free_record;
> }
> head_ref = new_head_ref;
> @@ -1071,6 +1092,8 @@ static int add_delayed_ref(struct btrfs_trans_handle *trans,
>
> if (qrecord_inserted)
> return btrfs_qgroup_trace_extent_post(trans, record, generic_ref->bytenr);
> +
> + kfree(record);
> return 0;
>
> free_record:
> --
> 2.51.0
>
Powered by blists - more mailing lists