Message-ID: <CAL3q7H5TPCmtbupb_gQuEnvFhh2dKU89T6C2TsUJqts8gxW00w@mail.gmail.com>
Date: Sun, 11 Jan 2026 19:32:03 +0000
From: Filipe Manana <fdmanana@...nel.org>
To: Jiasheng Jiang <jiashengjiangcool@...il.com>
Cc: Chris Mason <clm@...com>, David Sterba <dsterba@...e.com>, linux-btrfs@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] btrfs: reset block group size class when reservations are freed
On Sat, Jan 10, 2026 at 8:34 PM Jiasheng Jiang
<jiashengjiangcool@...il.com> wrote:
>
> Differential analysis of block-group.c shows an inconsistency between
> btrfs_add_reserved_bytes() and btrfs_free_reserved_bytes().
>
> When space is reserved, btrfs_use_block_group_size_class() is called to
> set a block group's size class, specializing it for a specific allocation
> size to reduce fragmentation. However, when these reservations are
> subsequently freed (e.g., due to an error or transaction abort),
> btrfs_free_reserved_bytes() fails to perform the corresponding cleanup.
>
> This leads to a state leak where a block group remains stuck with a
> specific size class even if it contains no used or reserved bytes. This
> stale state causes find_free_extent to unnecessarily skip these block
> groups for mismatched size requests, leading to suboptimal allocation
> behavior.
>
> Fix this by resetting the size class to BTRFS_BG_SZ_NONE in
> btrfs_free_reserved_bytes() when the block group becomes completely
> empty.
>
> Signed-off-by: Jiasheng Jiang <jiashengjiangcool@...il.com>
> ---
> fs/btrfs/block-group.c | 11 +++++++++++
> fs/btrfs/block-group.h | 1 +
> 2 files changed, 12 insertions(+)
>
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 08b14449fabe..1ecac4613a3e 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -3865,6 +3865,10 @@ void btrfs_free_reserved_bytes(struct btrfs_block_group *cache, u64 num_bytes,
>
>  	spin_lock(&space_info->lock);
>  	spin_lock(&cache->lock);
> +
> +	if (btrfs_block_group_should_use_size_class(cache))
> +		btrfs_maybe_reset_size_class(cache);
This does nothing: the block group's reserved counter is only
decremented below, so at this point cache->reserved still includes
num_bytes, and btrfs_maybe_reset_size_class() only resets the size
class if cache->reserved == 0.
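To illustrate the ordering problem, here is a minimal user-space sketch.
It is not kernel code: struct bg_model, enum size_class and both
free_reserved_*() helpers are hypothetical stand-ins, locking is omitted,
and only the reset condition is taken from the patch. It assumes the
caller frees a non-zero num_bytes that was previously reserved.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types, not the real definitions. */
enum size_class { SZ_NONE = 0, SZ_SMALL };

struct bg_model {
	uint64_t used;
	uint64_t reserved;
	enum size_class size_class;
};

/* Same condition as btrfs_maybe_reset_size_class() in the patch. */
static void maybe_reset_size_class(struct bg_model *bg)
{
	if (bg->used == 0 && bg->reserved == 0)
		bg->size_class = SZ_NONE;
}

/* Ordering used by the patch: check first, then decrement. */
static void free_reserved_check_first(struct bg_model *bg, uint64_t num_bytes)
{
	/* reserved still includes num_bytes here, so this cannot reset. */
	maybe_reset_size_class(bg);
	bg->reserved -= num_bytes;
}

/* Alternative ordering: decrement first, then check. */
static void free_reserved_check_last(struct bg_model *bg, uint64_t num_bytes)
{
	bg->reserved -= num_bytes;
	maybe_reset_size_class(bg);
}
```

With used == 0 and reserved == 4096, freeing 4096 bytes through
free_reserved_check_first() leaves the size class set, while
free_reserved_check_last() resets it to SZ_NONE.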
> +
>  	bg_ro = cache->ro;
>  	cache->reserved -= num_bytes;
>  	if (is_delalloc)
> @@ -4717,3 +4721,10 @@ bool btrfs_block_group_should_use_size_class(const struct btrfs_block_group *bg)
> return false;
> return true;
> }
> +
> +void btrfs_maybe_reset_size_class(struct btrfs_block_group *bg)
> +{
> +	lockdep_assert_held(&bg->lock);
> +	if (bg->used == 0 && bg->reserved == 0)
> +		bg->size_class = BTRFS_BG_SZ_NONE;
> +}
This is very short and only used in one place, so there is no need to
make it a function, and certainly no need to export it, since it's
only used in this file.
Thanks.
> diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
> index 5f933455118c..7e02db8a8bc6 100644
> --- a/fs/btrfs/block-group.h
> +++ b/fs/btrfs/block-group.h
> @@ -395,5 +395,6 @@ int btrfs_use_block_group_size_class(struct btrfs_block_group *bg,
> enum btrfs_block_group_size_class size_class,
> bool force_wrong_size_class);
> bool btrfs_block_group_should_use_size_class(const struct btrfs_block_group *bg);
> +void btrfs_maybe_reset_size_class(struct btrfs_block_group *bg);
>
> #endif /* BTRFS_BLOCK_GROUP_H */
> --
> 2.25.1
>
>