linux-kernel - Re: BUG: KCSAN: data-race in btrfs_calculate_inode_block_rsv_size [btrfs] / btrfs_use_block

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <b936c22f-5eb0-4f12-a6d5-7f69a42286f7@alu.unizg.hr>
Date:   Thu, 21 Sep 2023 21:22:07 +0200
From:   Mirsad Todorovac <mirsad.todorovac@....unizg.hr>
To:     dsterba@...e.cz
Cc:     linux-btrfs@...r.kernel.org, Josef Bacik <josef@...icpanda.com>,
        Chris Mason <clm@...com>, David Sterba <dsterba@...e.com>,
        linux-kernel@...r.kernel.org
Subject: Re: BUG: KCSAN: data-race in btrfs_calculate_inode_block_rsv_size
 [btrfs] / btrfs_use_block_rsv [btrfs] [EXPERIMENTAL PATCH]

On 9/20/23 17:29, David Sterba wrote:
> On Wed, Sep 20, 2023 at 08:18:35AM +0200, Mirsad Todorovac wrote:
>> Hi,
>>
>> This is your friendly bug reporter again.
>>
>> Please don't throw stuff at me, as I found another KCSAN data-race problem.
>>
>> I feel like a boy who cried wolf ...
>>
>> I hope this will get some attention, as it looks this time like a real btrfs problem that could cause
>> the kernel module to make a wrong turn when managing storage in different threads simultaneously and
>> lead to the corruption of data. However, I do not have an example of this corruption, it is by now only
>> theoretical in this otherwise great filesystem.
>>
>> In fact, I can announce quite a number of KCSAN bugs already in dmesg log:
>>
>>      # of
>> occuren
>>       ces problematic function
>> -------------------------------------------
>>       182 __bitmap_and+0xa3/0x110
>>         2 __bitmap_weight+0x62/0xa0
>>       138 __call_rcu_common.constprop.0
>>         3 __cgroup_account_cputime
>>         1 __dentry_kill
>>         3 __mod_lruvec_page_state
>>        15 __percpu_counter_compare
>>         1 __percpu_counter_sum+0x8f/0x120
>>         1 acpi_ut_acquire_mutex
>>         2 amdgpu_fence_emit
>>         1 btrfs_calculate_inode_block_rsv_size
>>         1 btrfs_page_set_uptodate
>>        28 copy_from_read_buf
>>         3 d_add
>>         3 d_splice_alias
>>         1 delayacct_add_tsk+0x10d/0x630
>>         7 do_epoll_ctl
>>         1 do_vmi_align_munmap
>>        86 drm_sched_entity_is_ready
>>         4 drm_sched_entity_pop_job
>>         3 enqueue_timer
>>         1 finish_fault+0xde/0x360
>>         2 generic_fillattr
>>         2 getrusage
>>         9 getrusage+0x3ba/0xaa0
>>         1 getrusage+0x3df/0xaa0
>>         6 inode_needs_update_time
>>         1 inode_set_ctime_current
>>         1 inode_update_timestamps
>>         3 kernfs_refresh_inode
>>        22 ktime_get_mono_fast_ns+0x87/0x120
>>        13 ktime_get_mono_fast_ns+0xb0/0x120
>>        24 ktime_get_mono_fast_ns+0xc0/0x120
>>        79 mas_topiary_replace
>>        12 mas_wr_modify
>>        61 mas_wr_node_store
>>         1 memchr_inv+0x71/0x160
>>         1 memchr_inv+0xcf/0x160
>>        19 n_tty_check_unthrottle
>>         5 n_tty_kick_worker
>>        35 n_tty_poll
>>        32 n_tty_read
>>         1 n_tty_read+0x5f8/0xaf0
>>         3 osq_lock
>>        27 process_one_work
>>         4 process_one_work+0x169/0x700
>>         2 rcu_implicit_dynticks_qs
>>         1 show_stat+0x45b/0xb70
>>         3 task_mem
>>       344 tick_nohz_idle_stop_tick
>>        32 tick_nohz_next_event
>>         1 tick_nohz_next_event+0xe7/0x1e0
>>        90 tick_sched_do_timer
>>         5 tick_sched_do_timer+0x2c/0x120
>>         1 wbt_done
>>         1 wbt_issue
>>         2 wq_worker_tick
>>        37 xas_clear_mark
>>
>> ------------------------------------------------------
>>
>> This report is from a vanilla torvalds tree 6.6-rc2 kernel on Ubuntu 22.04:
>>
>> [13429.116126] ==================================================================
>> [13429.116794] BUG: KCSAN: data-race in btrfs_calculate_inode_block_rsv_size [btrfs] / btrfs_use_block_rsv [btrfs]
> 
> Thanks for the report.  Some data races are known to happen in the
> reservation code but all the critical changes are done under locks, so
> an optimistic check may skip locking to check a status but then it's
> done properly again under a lock. Generally speaking.
> 
> We had several reports from static checkers and at least in one case we
> added an annotation so KCSAN does not complain:
> 
> https://git.kernel.org/linus/748f553c3c4c4f175c6c834358632aff802d72cf
> 
> The original report is at
> 
> https://lore.kernel.org/linux-btrfs/CAAwBoOJDjei5Hnem155N_cJwiEkVwJYvgN-tQrwWbZQGhFU=cA@mail.gmail.com/
> 
> I have briefly looked at your report, it seems to be different from the
> one above but still matches the general approach to the reservations. If
> it's a false flag then we can add another wrapper with the annotation,
> unless it's a real bug.

Thank you for your bug report evaluation.

I cannot do more but pass on what KCSAN provides - my experience with btrfs is far from
required (on the level of fresh user).

However, without attempting to argue, it seems to be possible that there is a data-race,
because the read side is in the function is not protected by a lock, and theoretically
the block_rsv->failfast can change by the write-side thread while the read-side thread
is using various parts of the block_rsv structure w/o a read lock.

Best regards,
Mirsad Todorovac