Message-ID: <CAL3q7H7c=Pa1D63M50MEn7PoSqi4K749KD5S2+EaVz=n53h2sw@mail.gmail.com>
Date: Mon, 12 Jan 2026 12:39:43 +0000
From: Filipe Manana <fdmanana@...nel.org>
To: Jiasheng Jiang <jiashengjiangcool@...il.com>
Cc: clm@...com, dsterba@...e.com, linux-btrfs@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] btrfs: reset block group size class when reservations are freed

On Sun, Jan 11, 2026 at 8:25 PM Jiasheng Jiang
<jiashengjiangcool@...il.com> wrote:
>
> Differential analysis of block-group.c shows an inconsistency between
> btrfs_add_reserved_bytes() and btrfs_free_reserved_bytes().
>
> When space is reserved, btrfs_use_block_group_size_class() is called to
> set a block group's size class, specializing it for a specific allocation
> size to reduce fragmentation. However, when these reservations are
> subsequently freed (e.g., due to an error or transaction abort),
> btrfs_free_reserved_bytes() fails to perform the corresponding cleanup.
>
> This leads to a state leak where a block group remains stuck with a
> specific size class even if it contains no used or reserved bytes. This
> stale state causes find_free_extent to unnecessarily skip these block
> groups for mismatched size requests, leading to suboptimal allocation
> behavior.

Not necessarily. If subsequent allocations are for the same extent
size, there's no problem at all.

It's also more than skipping: it can cause the allocation of new block
groups when none has a matching size class and none is without a size
class.

Did you observe this in practice? If so, with what kind of workload?

I think this should be rephrased. As stated, it gives the wrong idea
that the stale size class always causes bad behaviour, while in reality
that depends a lot on the workload.
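
To make the workload dependence concrete, here is a minimal userspace
sketch of the selection behaviour described above. It is not the kernel's
find_free_extent(): the names model_class, model_bg and
pick_block_group() are hypothetical, and the model ignores free space,
caching and any fallback stages the real allocator has.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model types; these are NOT the kernel's definitions. */
enum model_class { CLASS_NONE, CLASS_SMALL, CLASS_MEDIUM, CLASS_LARGE };

struct model_bg {
	enum model_class size_class;
};

/*
 * Return the index of a usable block group for a request of class
 * 'want': first one with a matching class, otherwise one without a
 * class. Return -1 if neither exists, which in this simplified model
 * stands for "a new block group would have to be allocated".
 */
static int pick_block_group(const struct model_bg *bgs, size_t n,
			    enum model_class want)
{
	size_t i;

	assert(bgs != NULL || n == 0);

	for (i = 0; i < n; i++)
		if (bgs[i].size_class == want)
			return (int)i;

	for (i = 0; i < n; i++)
		if (bgs[i].size_class == CLASS_NONE)
			return (int)i;

	return -1;
}
```

With two empty groups both stuck at CLASS_SMALL, a small request still
finds index 0, so nothing is lost. Only a request of another class gets
-1, and only as long as no group without a size class exists.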
>
> Fix this by resetting the size class to BTRFS_BG_SZ_NONE in
> btrfs_free_reserved_bytes() when the block group becomes completely
> empty.
>
> Fixes: 606d1bf10d7e ("btrfs: migrate the block group space accounting helpers")

That's completely wrong.

First, that commit only moved code around.
Second, that commit is from 2019, before we had support for block
group size classes (2022).

The proper commit would be 52bb7a2166af ("btrfs: introduce size class
to block group allocator").

> Signed-off-by: Jiasheng Jiang <jiashengjiangcool@...il.com>
> ---
> Changelog:
>
> v1 -> v2:
>
> 1. Inlined btrfs_maybe_reset_size_class() function.
> 2. Moved check below the reserved bytes decrement in btrfs_free_reserved_bytes().
> ---
> fs/btrfs/block-group.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 08b14449fabe..8339ad001d3f 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -3867,6 +3867,12 @@ void btrfs_free_reserved_bytes(struct btrfs_block_group *cache, u64 num_bytes,
>  	spin_lock(&cache->lock);
>  	bg_ro = cache->ro;
>  	cache->reserved -= num_bytes;
> +
> +	if (btrfs_block_group_should_use_size_class(cache)) {
> +		if (cache->used == 0 && cache->reserved == 0)
> +			cache->size_class = BTRFS_BG_SZ_NONE;
> +	}
> +
>  	if (is_delalloc)
>  		cache->delalloc_bytes -= num_bytes;
>  	spin_unlock(&cache->lock);
> --
> 2.25.1
>
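
For reference, the intended effect of the hunk can be modelled in
userspace roughly as below. This is only a sketch under stated
assumptions: toy_bg, toy_add_reserved() and toy_free_reserved() are
hypothetical stand-ins, there is no locking, and the reserve side simply
assumes a class is assigned only when the group has none.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model types; these are NOT the kernel's definitions. */
enum toy_size_class { TOY_SZ_NONE, TOY_SZ_SMALL, TOY_SZ_MEDIUM, TOY_SZ_LARGE };

struct toy_bg {
	uint64_t used;
	uint64_t reserved;
	enum toy_size_class size_class;
};

/* Reserve bytes; assumption: a class is set only if none is set yet. */
static void toy_add_reserved(struct toy_bg *bg, uint64_t num_bytes,
			     enum toy_size_class cls)
{
	bg->reserved += num_bytes;
	if (bg->size_class == TOY_SZ_NONE)
		bg->size_class = cls;
}

/* Free a reservation; drop the class once the group is fully empty. */
static void toy_free_reserved(struct toy_bg *bg, uint64_t num_bytes)
{
	assert(bg->reserved >= num_bytes);
	bg->reserved -= num_bytes;

	/* Mirrors the check added by the patch, after the decrement. */
	if (bg->used == 0 && bg->reserved == 0)
		bg->size_class = TOY_SZ_NONE;
}
```

In this model, a reserve followed by a full free leaves the group with
TOY_SZ_NONE again, while a group that still has used bytes keeps its
class.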