Message-ID: <CAPjX3FeaL2+oRz81OEkLKjWwr1XuOOa3t-kgCrc51we-C-GVng@mail.gmail.com>
Date: Tue, 18 Feb 2025 09:08:23 +0100
From: Daniel Vacek <neelx@...e.com>
To: Filipe Manana <fdmanana@...nel.org>
Cc: Hao-ran Zheng <zhenghaoran154@...il.com>, clm@...com, josef@...icpanda.com,
dsterba@...e.com, linux-btrfs@...r.kernel.org, linux-kernel@...r.kernel.org,
baijiaju1990@...il.com, 21371365@...a.edu.cn
Subject: Re: [PATCH] btrfs: fix data race when accessing the block_group's
used field
On Mon, 10 Feb 2025 at 12:11, Filipe Manana <fdmanana@...nel.org> wrote:
>
> On Sat, Feb 8, 2025 at 7:38 AM Hao-ran Zheng <zhenghaoran154@...il.com> wrote:
> >
> > A data race may occur when btrfs_discard_queue_work() and
> > btrfs_update_block_group() execute concurrently. Specifically, when
> > btrfs_update_block_group() reaches the line `cache->used = old_val;`
> > while btrfs_discard_queue_work() is evaluating
> > `if (block_group->used == 0)`, the two accesses race, which may
> > cause the block group to be placed unexpectedly in the discard_list
> > or the discard_unused_list. The reported call stacks are as follows:
> >
> > ============DATA_RACE============
> > btrfs_discard_queue_work+0x245/0x500 [btrfs]
> > __btrfs_add_free_space+0x3066/0x32f0 [btrfs]
> > btrfs_add_free_space+0x19a/0x200 [btrfs]
> > unpin_extent_range+0x847/0x2120 [btrfs]
> > btrfs_finish_extent_commit+0x9a3/0x1840 [btrfs]
> > btrfs_commit_transaction+0x5f65/0xc0f0 [btrfs]
> > transaction_kthread+0x764/0xc20 [btrfs]
> > kthread+0x292/0x330
> > ret_from_fork+0x4d/0x80
> > ret_from_fork_asm+0x1a/0x30
> > ============OTHER_INFO============
> > btrfs_update_block_group+0xa9d/0x2430 [btrfs]
> > __btrfs_free_extent+0x4f69/0x9920 [btrfs]
> > __btrfs_run_delayed_refs+0x290e/0xd7d0 [btrfs]
> > btrfs_run_delayed_refs+0x317/0x770 [btrfs]
> > flush_space+0x388/0x1440 [btrfs]
> > btrfs_preempt_reclaim_metadata_space+0xd65/0x14c0 [btrfs]
> > process_scheduled_works+0x716/0xf10
> > worker_thread+0xb6a/0x1190
> > kthread+0x292/0x330
> > ret_from_fork+0x4d/0x80
> > ret_from_fork_asm+0x1a/0x30
> > =================================
> >
> > Although `block_group->used` is checked again in
> > `peek_discard_list()`, `block_group->used` is a 64-bit variable, so
> > the unlocked access can be subject to load/store tearing and we
> > still consider the data race unexpected behavior. Use `READ_ONCE()`
> > and `WRITE_ONCE()` for the read and the writes.
> >
> > Signed-off-by: Hao-ran Zheng <zhenghaoran154@...il.com>
> > ---
> > fs/btrfs/block-group.c | 4 ++--
> > fs/btrfs/discard.c | 2 +-
> > 2 files changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> > index c0a8f7d92acc..c681b97f6835 100644
> > --- a/fs/btrfs/block-group.c
> > +++ b/fs/btrfs/block-group.c
> > @@ -3678,7 +3678,7 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans,
> > old_val = cache->used;
> > if (alloc) {
> > old_val += num_bytes;
> > - cache->used = old_val;
> > + WRITE_ONCE(cache->used, old_val);
> > cache->reserved -= num_bytes;
> > cache->reclaim_mark = 0;
> > space_info->bytes_reserved -= num_bytes;
> > @@ -3690,7 +3690,7 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans,
> > spin_unlock(&space_info->lock);
> > } else {
> > old_val -= num_bytes;
> > - cache->used = old_val;
> > + WRITE_ONCE(cache->used, old_val);
> > cache->pinned += num_bytes;
> > btrfs_space_info_update_bytes_pinned(space_info, num_bytes);
> > space_info->bytes_used -= num_bytes;
> > diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
> > index e815d165cccc..71c57b571d50 100644
> > --- a/fs/btrfs/discard.c
> > +++ b/fs/btrfs/discard.c
> > @@ -363,7 +363,7 @@ void btrfs_discard_queue_work(struct btrfs_discard_ctl *discard_ctl,
> > if (!block_group || !btrfs_test_opt(block_group->fs_info, DISCARD_ASYNC))
> > return;
> >
> > - if (block_group->used == 0)
> > + if (READ_ONCE(block_group->used) == 0)
>
> There are at least 3 more places in discard.c where we access ->used
> without being under the protection of the block group's spinlock.
> So let's fix this for all places and not just a single one...
>
> Also, it is quite ugly to spread READ_ONCE()/WRITE_ONCE() all over the place.
> What we typically do in btrfs is to add helpers that hide them, see
> block-rsv.h for example.
>
> Also, I don't think we need READ_ONCE/WRITE_ONCE.
> We could use data_race(), though I think that could be subject to
> load/store tearing, or just take the lock.
> So adding a helper like this to block-group.h:
>
> static inline u64 btrfs_block_group_used(struct btrfs_block_group *bg)
> {
>         u64 ret;
>
>         spin_lock(&bg->lock);
>         ret = bg->used;
>         spin_unlock(&bg->lock);
>
>         return ret;
> }
Would memory barriers be sufficient here? Taking a lock just for
reading one member seems excessive...
> And then use btrfs_block_group_used() everywhere in discard.c where we
> aren't holding a block group's lock.
>
> Thanks.
>
>
> > add_to_discard_unused_list(discard_ctl, block_group);
> > else
> > add_to_discard_list(discard_ctl, block_group);
> > --
> > 2.34.1
> >
> >
>