Message-ID: <CAL3q7H7c=Pa1D63M50MEn7PoSqi4K749KD5S2+EaVz=n53h2sw@mail.gmail.com>
Date: Mon, 12 Jan 2026 12:39:43 +0000
From: Filipe Manana <fdmanana@...nel.org>
To: Jiasheng Jiang <jiashengjiangcool@...il.com>
Cc: clm@...com, dsterba@...e.com, linux-btrfs@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] btrfs: reset block group size class when reservations
 are freed

On Sun, Jan 11, 2026 at 8:25 PM Jiasheng Jiang
<jiashengjiangcool@...il.com> wrote:
>
> Differential analysis of block-group.c shows an inconsistency between
> btrfs_add_reserved_bytes() and btrfs_free_reserved_bytes().
>
> When space is reserved, btrfs_use_block_group_size_class() is called to
> set a block group's size class, specializing it for a specific allocation
> size to reduce fragmentation. However, when these reservations are
> subsequently freed (e.g., due to an error or transaction abort),
> btrfs_free_reserved_bytes() fails to perform the corresponding cleanup.
>
> This leads to a state leak where a block group remains stuck with a
> specific size class even if it contains no used or reserved bytes. This
> stale state causes find_free_extent to unnecessarily skip these block
> groups for mismatched size requests, leading to suboptimal allocation
> behavior.

Not necessarily always. If there are subsequent allocations for the
same extent size, then there's no problem at all.

It is more than skipping: it can cause the allocation of new block
groups if none has a matching size class and none is without a size
class.
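
To illustrate the point above, here is a minimal userspace sketch, not
the real find_free_extent() logic. All names (model_bg, model_pick_bg,
the MODEL_SZ_* values) are hypothetical, and the model assumes a block
group is eligible only if its size class matches the request or is
unset:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's size classes. */
enum model_size_class {
	MODEL_SZ_NONE,
	MODEL_SZ_SMALL,
	MODEL_SZ_MEDIUM,
	MODEL_SZ_LARGE,
};

/* Simplified block group: only the fields this model needs. */
struct model_bg {
	uint64_t used;
	uint64_t reserved;
	enum model_size_class size_class;
};

/*
 * Return the index of the first block group whose size class matches
 * the request or is unset, or -1 if none qualifies. In this model, -1
 * stands for "a new block group would have to be allocated".
 */
static int model_pick_bg(const struct model_bg *bgs, size_t n,
			 enum model_size_class wanted)
{
	for (size_t i = 0; i < n; i++) {
		if (bgs[i].size_class == MODEL_SZ_NONE ||
		    bgs[i].size_class == wanted)
			return (int)i;
	}
	return -1;
}
```

With a single empty block group left at MODEL_SZ_SMALL, a request for
MODEL_SZ_SMALL still picks it, so nothing goes wrong; a request for
MODEL_SZ_LARGE gets -1 in this model, even though the group has no used
or reserved bytes.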

I wonder if you observed this in practice and, if so, with what kind
of workload.

I think that should be rephrased, because as stated it gives the
wrong idea that it will always cause bad behaviour, while in reality
that depends a lot on the workload.

>
> Fix this by resetting the size class to BTRFS_BG_SZ_NONE in
> btrfs_free_reserved_bytes() when the block group becomes completely
> empty.
>
> Fixes: 606d1bf10d7e ("btrfs: migrate the block group space accounting helpers")

What? That's completely wrong.

First, that commit only moved code around.
Second, that commit is from 2019, before we had support for block
group size classes (2022).

The proper commit would be 52bb7a2166af ("btrfs: introduce size class
to block group allocator").


> Signed-off-by: Jiasheng Jiang <jiashengjiangcool@...il.com>
> ---
> Changelog:
>
> v1 -> v2:
>
> 1. Inlined btrfs_maybe_reset_size_class() function.
> 2. Moved check below the reserved bytes decrement in btrfs_free_reserved_bytes().
> ---
>  fs/btrfs/block-group.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 08b14449fabe..8339ad001d3f 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -3867,6 +3867,12 @@ void btrfs_free_reserved_bytes(struct btrfs_block_group *cache, u64 num_bytes,
>         spin_lock(&cache->lock);
>         bg_ro = cache->ro;
>         cache->reserved -= num_bytes;
> +
> +       if (btrfs_block_group_should_use_size_class(cache)) {
> +               if (cache->used == 0 && cache->reserved == 0)
> +                       cache->size_class = BTRFS_BG_SZ_NONE;
> +       }
> +
>         if (is_delalloc)
>                 cache->delalloc_bytes -= num_bytes;
>         spin_unlock(&cache->lock);
> --
> 2.25.1
>
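
For reference, the behaviour the hunk above aims for can be modelled in
userspace as below. This is only a sketch under stated assumptions: the
names (sketch_bg, sketch_free_reserved, SKETCH_SZ_*) are hypothetical,
the spinlock is omitted, and the
btrfs_block_group_should_use_size_class() check is left out, so it is
not a substitute for the kernel code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's size classes. */
enum sketch_size_class {
	SKETCH_SZ_NONE,
	SKETCH_SZ_SMALL,
	SKETCH_SZ_MEDIUM,
	SKETCH_SZ_LARGE,
};

/* Simplified block group: only the counters touched by the hunk. */
struct sketch_bg {
	uint64_t used;
	uint64_t reserved;
	uint64_t delalloc_bytes;
	enum sketch_size_class size_class;
};

/*
 * Model of freeing a reservation: drop the reserved bytes, then clear
 * the size class only once the group has neither used nor reserved
 * bytes left. Locking is intentionally not modelled.
 */
static void sketch_free_reserved(struct sketch_bg *bg, uint64_t num_bytes,
				 bool is_delalloc)
{
	bg->reserved -= num_bytes;

	if (bg->used == 0 && bg->reserved == 0)
		bg->size_class = SKETCH_SZ_NONE;

	if (is_delalloc)
		bg->delalloc_bytes -= num_bytes;
}
```

In this model, freeing only part of the reservation leaves the size
class in place; it is cleared when the last reserved bytes are freed
and nothing is in use.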
