[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <b936c22f-5eb0-4f12-a6d5-7f69a42286f7@alu.unizg.hr>
Date: Thu, 21 Sep 2023 21:22:07 +0200
From: Mirsad Todorovac <mirsad.todorovac@....unizg.hr>
To: dsterba@...e.cz
Cc: linux-btrfs@...r.kernel.org, Josef Bacik <josef@...icpanda.com>,
Chris Mason <clm@...com>, David Sterba <dsterba@...e.com>,
linux-kernel@...r.kernel.org
Subject: Re: BUG: KCSAN: data-race in btrfs_calculate_inode_block_rsv_size
[btrfs] / btrfs_use_block_rsv [btrfs] [EXPERIMENTAL PATCH]
On 9/20/23 17:29, David Sterba wrote:
> On Wed, Sep 20, 2023 at 08:18:35AM +0200, Mirsad Todorovac wrote:
>> Hi,
>>
>> This is your friendly bug reporter again.
>>
>> Please don't throw stuff at me, as I found another KCSAN data-race problem.
>>
>> I feel like a boy who cried wolf ...
>>
>> I hope this will get some attention, as it looks this time like a real btrfs problem that could cause
>> the kernel module to make a wrong turn when managing storage in different threads simultaneously and
>> lead to the corruption of data. However, I do not have an example of this corruption, it is by now only
>> theoretical in this otherwise great filesystem.
>>
>> In fact, I can announce quite a number of KCSAN bugs already in dmesg log:
>>
>> # of
>> occuren
>> ces problematic function
>> -------------------------------------------
>> 182 __bitmap_and+0xa3/0x110
>> 2 __bitmap_weight+0x62/0xa0
>> 138 __call_rcu_common.constprop.0
>> 3 __cgroup_account_cputime
>> 1 __dentry_kill
>> 3 __mod_lruvec_page_state
>> 15 __percpu_counter_compare
>> 1 __percpu_counter_sum+0x8f/0x120
>> 1 acpi_ut_acquire_mutex
>> 2 amdgpu_fence_emit
>> 1 btrfs_calculate_inode_block_rsv_size
>> 1 btrfs_page_set_uptodate
>> 28 copy_from_read_buf
>> 3 d_add
>> 3 d_splice_alias
>> 1 delayacct_add_tsk+0x10d/0x630
>> 7 do_epoll_ctl
>> 1 do_vmi_align_munmap
>> 86 drm_sched_entity_is_ready
>> 4 drm_sched_entity_pop_job
>> 3 enqueue_timer
>> 1 finish_fault+0xde/0x360
>> 2 generic_fillattr
>> 2 getrusage
>> 9 getrusage+0x3ba/0xaa0
>> 1 getrusage+0x3df/0xaa0
>> 6 inode_needs_update_time
>> 1 inode_set_ctime_current
>> 1 inode_update_timestamps
>> 3 kernfs_refresh_inode
>> 22 ktime_get_mono_fast_ns+0x87/0x120
>> 13 ktime_get_mono_fast_ns+0xb0/0x120
>> 24 ktime_get_mono_fast_ns+0xc0/0x120
>> 79 mas_topiary_replace
>> 12 mas_wr_modify
>> 61 mas_wr_node_store
>> 1 memchr_inv+0x71/0x160
>> 1 memchr_inv+0xcf/0x160
>> 19 n_tty_check_unthrottle
>> 5 n_tty_kick_worker
>> 35 n_tty_poll
>> 32 n_tty_read
>> 1 n_tty_read+0x5f8/0xaf0
>> 3 osq_lock
>> 27 process_one_work
>> 4 process_one_work+0x169/0x700
>> 2 rcu_implicit_dynticks_qs
>> 1 show_stat+0x45b/0xb70
>> 3 task_mem
>> 344 tick_nohz_idle_stop_tick
>> 32 tick_nohz_next_event
>> 1 tick_nohz_next_event+0xe7/0x1e0
>> 90 tick_sched_do_timer
>> 5 tick_sched_do_timer+0x2c/0x120
>> 1 wbt_done
>> 1 wbt_issue
>> 2 wq_worker_tick
>> 37 xas_clear_mark
>>
>> ------------------------------------------------------
>>
>> This report is from a vanilla torvalds tree 6.6-rc2 kernel on Ubuntu 22.04:
>>
>> [13429.116126] ==================================================================
>> [13429.116794] BUG: KCSAN: data-race in btrfs_calculate_inode_block_rsv_size [btrfs] / btrfs_use_block_rsv [btrfs]
>
> Thanks for the report. Some data races are known to happen in the
> reservation code but all the critical changes are done under locks, so
> an optimistic check may skip locking to check a status but then it's
> done properly again under a lock. Generally speaking.
>
> We had several reports from static checkers and at least in one case we
> added an annotation so KCSAN does not complain:
>
> https://git.kernel.org/linus/748f553c3c4c4f175c6c834358632aff802d72cf
>
> The original report is at
>
> https://lore.kernel.org/linux-btrfs/CAAwBoOJDjei5Hnem155N_cJwiEkVwJYvgN-tQrwWbZQGhFU=cA@mail.gmail.com/
>
> I have briefly looked at your report, it seems to be different from the
> one above but still matches the general approach to the reservations. If
> it's a false flag then we can add another wrapper with the annotation,
> unless it's a real bug.
Thank you for your bug report evaluation.
I cannot do more but pass on what KCSAN provides - my experience with btrfs is far from
required (on the level of fresh user).
However, without attempting to argue, it seems to be possible that there is a data-race,
because the read side is in the function is not protected by a lock, and theoretically
the block_rsv->failfast can change by the write-side thread while the read-side thread
is using various parts of the block_rsv structure w/o a read lock.
Best regards,
Mirsad Todorovac
Powered by blists - more mailing lists