Message-ID: <CAL3q7H5JgQkFavwrjOsvxDt9mMjVUK_nPOha-WYU-muLW=Orug@mail.gmail.com>
Date: Mon, 10 Feb 2025 11:09:43 +0000
From: Filipe Manana <fdmanana@...nel.org>
To: Hao-ran Zheng <zhenghaoran154@...il.com>
Cc: clm@...com, josef@...icpanda.com, dsterba@...e.com,
linux-btrfs@...r.kernel.org, linux-kernel@...r.kernel.org,
baijiaju1990@...il.com, 21371365@...a.edu.cn
Subject: Re: [PATCH] btrfs: fix data race when accessing the block_group's
used field
On Sat, Feb 8, 2025 at 7:38 AM Hao-ran Zheng <zhenghaoran154@...il.com> wrote:
>
> A data race may occur when btrfs_discard_queue_work() and
> btrfs_update_block_group() execute concurrently. Specifically, while
> btrfs_update_block_group() executes the line `cache->used = old_val;`,
> a concurrent btrfs_discard_queue_work() evaluating
> `if (block_group->used == 0)` races with that store, which may cause
> the block_group to be placed unexpectedly in discard_list or
> discard_unused_list. The reported call stacks are as follows:
>
> ============DATA_RACE============
> btrfs_discard_queue_work+0x245/0x500 [btrfs]
> __btrfs_add_free_space+0x3066/0x32f0 [btrfs]
> btrfs_add_free_space+0x19a/0x200 [btrfs]
> unpin_extent_range+0x847/0x2120 [btrfs]
> btrfs_finish_extent_commit+0x9a3/0x1840 [btrfs]
> btrfs_commit_transaction+0x5f65/0xc0f0 [btrfs]
> transaction_kthread+0x764/0xc20 [btrfs]
> kthread+0x292/0x330
> ret_from_fork+0x4d/0x80
> ret_from_fork_asm+0x1a/0x30
> ============OTHER_INFO============
> btrfs_update_block_group+0xa9d/0x2430 [btrfs]
> __btrfs_free_extent+0x4f69/0x9920 [btrfs]
> __btrfs_run_delayed_refs+0x290e/0xd7d0 [btrfs]
> btrfs_run_delayed_refs+0x317/0x770 [btrfs]
> flush_space+0x388/0x1440 [btrfs]
> btrfs_preempt_reclaim_metadata_space+0xd65/0x14c0 [btrfs]
> process_scheduled_works+0x716/0xf10
> worker_thread+0xb6a/0x1190
> kthread+0x292/0x330
> ret_from_fork+0x4d/0x80
> ret_from_fork_asm+0x1a/0x30
> =================================
>
> Although `block_group->used` is checked again in peek_discard_list(),
> `block_group->used` is a 64-bit variable, so its loads and stores may
> tear, and we still consider this data race unexpected behavior. Add
> READ_ONCE() and WRITE_ONCE() for the reads and writes.
>
> Signed-off-by: Hao-ran Zheng <zhenghaoran154@...il.com>
> ---
> fs/btrfs/block-group.c | 4 ++--
> fs/btrfs/discard.c | 2 +-
> 2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index c0a8f7d92acc..c681b97f6835 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -3678,7 +3678,7 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans,
> old_val = cache->used;
> if (alloc) {
> old_val += num_bytes;
> - cache->used = old_val;
> + WRITE_ONCE(cache->used, old_val);
> cache->reserved -= num_bytes;
> cache->reclaim_mark = 0;
> space_info->bytes_reserved -= num_bytes;
> @@ -3690,7 +3690,7 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans,
> spin_unlock(&space_info->lock);
> } else {
> old_val -= num_bytes;
> - cache->used = old_val;
> + WRITE_ONCE(cache->used, old_val);
> cache->pinned += num_bytes;
> btrfs_space_info_update_bytes_pinned(space_info, num_bytes);
> space_info->bytes_used -= num_bytes;
> diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
> index e815d165cccc..71c57b571d50 100644
> --- a/fs/btrfs/discard.c
> +++ b/fs/btrfs/discard.c
> @@ -363,7 +363,7 @@ void btrfs_discard_queue_work(struct btrfs_discard_ctl *discard_ctl,
> if (!block_group || !btrfs_test_opt(block_group->fs_info, DISCARD_ASYNC))
> return;
>
> - if (block_group->used == 0)
> + if (READ_ONCE(block_group->used) == 0)
There are at least 3 more places in discard.c where we access ->used
without being under the protection of the block group's spinlock.
So let's fix this for all places and not just a single one...
Also, it is quite ugly to spread READ_ONCE()/WRITE_ONCE() all over the place.
What we typically do in btrfs is to add helpers that hide them, see
block-rsv.h for example.
Also, I don't think we need READ_ONCE/WRITE_ONCE.
We could use data_race(), though I think that could be subject to
load/store tearing, or just take the lock.
So adding a helper like this to block-group.h:
static inline u64 btrfs_block_group_used(struct btrfs_block_group *bg)
{
	u64 ret;

	spin_lock(&bg->lock);
	ret = bg->used;
	spin_unlock(&bg->lock);
	return ret;
}
And then use btrfs_block_group_used() everywhere in discard.c where we
aren't holding a block group's lock.
Thanks.
> add_to_discard_unused_list(discard_ctl, block_group);
> else
> add_to_discard_list(discard_ctl, block_group);
> --
> 2.34.1
>
>