[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAL3q7H6NVxiEHQfyMWbNQuszw74AyQqXo6_K9KzyKs=VRvK0yA@mail.gmail.com>
Date: Thu, 26 Jun 2025 13:37:42 +0100
From: Filipe Manana <fdmanana@...nel.org>
To: Hao-ran Zheng <zhenghaoran154@...il.com>
Cc: clm@...com, josef@...icpanda.com, dsterba@...e.com,
linux-btrfs@...r.kernel.org, linux-kernel@...r.kernel.org,
baijiaju1990@...il.com, zzzccc427@...il.com
Subject: Re: [PATCH] btrfs: Two data races in btrfs
On Thu, Jun 26, 2025 at 4:08 AM Hao-ran Zheng <zhenghaoran154@...il.com> wrote:
>
> Hello maintainers,
>
> I would like to report two data race bugs we discovered in BTRFS
> filesystem on Linux kernel v6.16-rc3. These issues were identified
> using our data race detector.
This sort of text is not proper for a patch... Saying hello and saying
that you are reporting is totally irrelevant and odd.
And what do you mean by "our data race detector"? Is it something else
other than KCSAN, something not in the linux kernel tree?
>
> These issues were deemed non-critical after evaluation, as they
> do not impact core functionality or security. To minimize
> performance overhead while ensuring clarity for future maintenance,
> I have annotated them with data_race() macros.
You have to explain things in far more detail.
To me it seems you are sprinkling data_race() randomly just because
it's fine in some other places.
You don't explain why "these issues were deemed non-critical after evaluation".
>
> Below is a summary of the findings:
>
> ---
>
> ============ DATARACE ============
> extent_write_cache_pages fs/btrfs/extent_io.c:2439 [inline]
> btrfs_writepages+0x34fc/0x3d20 fs/btrfs/extent_io.c:2376
> do_writepages+0x302/0x7c0 mm/page-writeback.c:2687
> filemap_fdatawrite_wbc mm/filemap.c:389 [inline]
> __filemap_fdatawrite_range mm/filemap.c:422 [inline]
> filemap_fdatawrite_range+0x145/0x1d0 mm/filemap.c:440
> btrfs_fdatawrite_range fs/btrfs/file.c:3701 [inline]
> start_ordered_ops fs/btrfs/file.c:1439 [inline]
> btrfs_sync_file+0x6e7/0x1d70 fs/btrfs/file.c:1550
> generic_write_sync include/linux/fs.h:2970 [inline]
> btrfs_do_write_iter+0xd0c/0x12f0 fs/btrfs/file.c:1391
> btrfs_file_write_iter+0x3d/0x60 fs/btrfs/file.c:1401
> new_sync_write fs/read_write.c:586 [inline]
> vfs_write+0x940/0xd10 fs/read_write.c:679
> ksys_write+0x116/0x200 fs/read_write.c:731
> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> do_syscall_64+0xc9/0x1a0 arch/x86/entry/common.c:83
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 0x0
> ============OTHER_INFO============
> extent_write_cache_pages fs/btrfs/extent_io.c:2439 [inline]
> btrfs_writepages+0x34fc/0x3d20 fs/btrfs/extent_io.c:2376
> do_writepages+0x302/0x7c0 mm/page-writeback.c:2687
> filemap_fdatawrite_wbc mm/filemap.c:389 [inline]
> __filemap_fdatawrite_range mm/filemap.c:422 [inline]
> filemap_fdatawrite_range+0x145/0x1d0 mm/filemap.c:440
> btrfs_fdatawrite_range fs/btrfs/file.c:3701 [inline]
> start_ordered_ops fs/btrfs/file.c:1439 [inline]
> btrfs_sync_file+0x509/0x1d70 fs/btrfs/file.c:1521
> generic_write_sync include/linux/fs.h:2970 [inline]
> btrfs_do_write_iter+0xd0c/0x12f0 fs/btrfs/file.c:1391
> btrfs_file_write_iter+0x3d/0x60 fs/btrfs/file.c:1401
> new_sync_write fs/read_write.c:586 [inline]
> vfs_write+0x940/0xd10 fs/read_write.c:679
> ksys_write+0x116/0x200 fs/read_write.c:731
> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> do_syscall_64+0xc9/0x1a0 arch/x86/entry/common.c:83
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> =================END==============
>
> ===========================DATA_RACE===========================
> btrfs_inode_safe_disk_i_size_write+0x144/0x190 fs/btrfs/file-item.c:65
> btrfs_finish_one_ordered+0x999/0x1330 fs/btrfs/inode.c:3203
> btrfs_finish_ordered_io+0x33/0x50 fs/btrfs/inode.c:3308
> finish_ordered_fn+0x3a/0x50 fs/btrfs/ordered-data.c:331
> btrfs_work_helper+0x199/0x6c0 fs/btrfs/async-thread.c:314
> process_one_work kernel/workqueue.c:3238 [inline]
> process_scheduled_works+0x21f/0x520 kernel/workqueue.c:3319
> worker_thread+0x323/0x4a0 kernel/workqueue.c:3400
> kthread+0x2d5/0x300 kernel/kthread.c:464
> ret_from_fork+0x4d/0x60 arch/x86/kernel/process.c:148
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> ============OTHER_INFO============
> fill_stack_inode_item fs/btrfs/delayed-inode.c:1809 [inline]
> btrfs_delayed_update_inode+0x1ab/0xa40 fs/btrfs/delayed-inode.c:1931
> btrfs_update_inode+0x128/0x270 fs/btrfs/inode.c:4156
> btrfs_setxattr_trans+0x143/0x280 fs/btrfs/xattr.c:266
> btrfs_xattr_handler_set+0xb7/0xf0 fs/btrfs/xattr.c:380
> __vfs_setxattr+0x21e/0x240 fs/xattr.c:200
> __vfs_setxattr_noperm+0xa5/0x2d0 fs/xattr.c:234
> vfs_setxattr+0xd5/0x1d0 fs/xattr.c:321
> do_setxattr fs/xattr.c:636 [inline]
> file_setxattr+0xb0/0x110 fs/xattr.c:646
> path_setxattrat+0x217/0x260 fs/xattr.c:711
> __do_sys_fsetxattr fs/xattr.c:761 [inline]
> __se_sys_fsetxattr fs/xattr.c:758 [inline]
> __x64_sys_fsetxattr+0x2c/0x40 fs/xattr.c:758
> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> do_syscall_64+0xc9/0x1a0 arch/x86/entry/common.c:83
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
> =================================
>
> Signed-off-by: Hao-ran Zheng <zhenghaoran154@...il.com>
> ---
> fs/btrfs/extent_io.c | 2 +-
> fs/btrfs/file-item.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 849199768664..0c03fafc3ae0 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -2436,7 +2436,7 @@ static int extent_write_cache_pages(struct address_space *mapping,
> }
>
> if (wbc->range_cyclic || (wbc->nr_to_write > 0 && range_whole))
> - mapping->writeback_index = done_index;
> + data_race(mapping->writeback_index = done_index);
>
> btrfs_add_delayed_iput(BTRFS_I(inode));
> return ret;
> diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
> index 54d523d4f421..15572e79b6de 100644
> --- a/fs/btrfs/file-item.c
> +++ b/fs/btrfs/file-item.c
> @@ -61,7 +61,7 @@ void btrfs_inode_safe_disk_i_size_write(struct btrfs_inode *inode, u64 new_i_siz
> i_size = min(i_size, end + 1);
> else
> i_size = 0;
> - inode->disk_i_size = i_size;
> + data_race(inode->disk_i_size = i_size);
These are two completely different cases, each should be in a
dedicated patch with a proper analysis.
At least this one for disk_i_size, I don't think data_race() is a good solution.
It doesn't prevent store and load tearing, which would result in an
inode item with a bogus size.
Please provide a rationale for the proposed solution for each case.
We have gone through this in a patch you sent in the past (commit
5324c4e10e9c2ce307a037e904c0d9671d7137d9), and there data_race() was
ok because getting a stale value or some weird value due to the result
of a load/store tearing would only makes us take unnecessary locks,
therefore not affecting correctness - it's this type of analysis that
you should place in a change log.
Thanks.
> out_unlock:
> spin_unlock(&inode->lock);
> }
> --
> 2.34.1
>
>
Powered by blists - more mailing lists