Message-Id: <20260112143523.31542-1-jiashengjiangcool@gmail.com>
Date: Mon, 12 Jan 2026 14:35:23 +0000
From: Jiasheng Jiang <jiashengjiangcool@...il.com>
To: Filipe Manana <fdmanana@...nel.org>
Cc: clm@...com,
dsterba@...e.com,
linux-btrfs@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] btrfs: reset block group size class when reservations are freed
On Mon, Jan 12, 2026 at 12:39:43 +0000, Filipe Manana <fdmanana@...nel.org> wrote:
> On Sun, Jan 11, 2026 at 8:25 PM Jiasheng Jiang
> <jiashengjiangcool@...il.com> wrote:
>>
>> Differential analysis of block-group.c shows an inconsistency between
>> btrfs_add_reserved_bytes() and btrfs_free_reserved_bytes().
>>
>> When space is reserved, btrfs_use_block_group_size_class() is called to
>> set a block group's size class, specializing it for a specific allocation
>> size to reduce fragmentation. However, when these reservations are
>> subsequently freed (e.g., due to an error or transaction abort),
>> btrfs_free_reserved_bytes() fails to perform the corresponding cleanup.
>>
>> This leads to a state leak where a block group remains stuck with a
>> specific size class even if it contains no used or reserved bytes. This
>> stale state causes find_free_extent to unnecessarily skip these block
>> groups for mismatched size requests, leading to suboptimal allocation
>> behavior.
>
> Not necessarily always. If there are subsequent allocations for the
> same extent size, then there's no problem at all.
>
> There's more than skipping, it can cause allocation of new block
> groups if there are none with a matching size class and there aren't
> any without a size class.
>
> I wonder if you observed this in practice and what kind of workload.
>
> I think that should be rephrased because as it's stated it gives the
> wrong idea that it will always cause bad behaviour, while in reality
> that depends a lot on the workload.
You are right. This inconsistency was identified through differential
analysis of the space accounting logic. I haven't observed it in a
specific production workload yet. I will rephrase the description in v3
to clarify that the impact is workload-dependent and can lead to
unnecessary allocation of new block groups.
>>
>> Fix this by resetting the size class to BTRFS_BG_SZ_NONE in
>> btrfs_free_reserved_bytes() when the block group becomes completely
>> empty.
>>
>> Fixes: 606d1bf10d7e ("btrfs: migrate the block group space accounting helpers")
>
> What? That's completely wrong.
>
> First, that commit only moved code around.
> Secondly, that commit happened (2019) before we had support for block
> group size classes (2022).
>
> The proper commit would be 52bb7a2166af ("btrfs: introduce size class
> to block group allocator").
My apologies for the oversight. I will correct the Fixes tag to 52bb7a2166af in v3.
I will send a v3 with the updated commit message and corrected tags shortly.
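For reference, the behaviour discussed above (a size class that stays set after the last reservation is freed, and the proposed reset to "none" once the block group is empty) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the struct, the enum, and every model_* function are hypothetical stand-ins, not the kernel's btrfs_block_group, btrfs_add_reserved_bytes(), or btrfs_free_reserved_bytes(), and the matching rule is a deliberate simplification of what find_free_extent does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the block group size classes. */
enum model_size_class { MODEL_SZ_NONE = 0, MODEL_SZ_SMALL, MODEL_SZ_MEDIUM, MODEL_SZ_LARGE };

/* Hypothetical, heavily simplified block group: only the fields the model needs. */
struct model_bg {
	uint64_t used;                    /* bytes already allocated to extents */
	uint64_t reserved;                /* bytes reserved but not yet used */
	enum model_size_class size_class; /* current specialization, if any */
};

/* Reserve n bytes; a block group without a size class adopts the requested one. */
static void model_add_reserved(struct model_bg *bg, uint64_t n,
			       enum model_size_class sc)
{
	bg->reserved += n;
	if (bg->size_class == MODEL_SZ_NONE)
		bg->size_class = sc;
}

/*
 * Free n reserved bytes. The reset at the end models the proposed fix:
 * once nothing is used or reserved, drop the size class again.
 */
static void model_free_reserved(struct model_bg *bg, uint64_t n)
{
	assert(bg->reserved >= n);
	bg->reserved -= n;
	if (bg->used == 0 && bg->reserved == 0)
		bg->size_class = MODEL_SZ_NONE;
}

/* Simplified matching rule: usable if unclassified or same class as the request. */
static int model_bg_matches(const struct model_bg *bg, enum model_size_class sc)
{
	return bg->size_class == MODEL_SZ_NONE || bg->size_class == sc;
}
```

In this model, reserving 4096 bytes with MODEL_SZ_SMALL and then freeing them leaves the block group unclassified, so a later MODEL_SZ_LARGE request still matches it. Without the reset in model_free_reserved(), that request would not match, and if no other block group matched or was unclassified, a new one would have to be allocated.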
Best regards,
Jiasheng