linux-kernel - Re: [BUG] KCSAN: data-race in tick_nohz_idle_stop_tick / tick_nohz_next

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d9a3eb80-abaa-4453-a1ec-cc1200c58481@alu.unizg.hr>
Date: Sat, 9 Mar 2024 17:13:43 +0100
From: Mirsad Todorovac <mirsad.todorovac@....unizg.hr>
To: Sean Anderson <sean.anderson@...o.com>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [BUG] KCSAN: data-race in tick_nohz_idle_stop_tick /
 tick_nohz_next_event

Hi, Sean,

I am reducing Cc: so if you could please attach the decode_stacktrace.sh output of your build.

Thanks,
Mirsad

On 3/7/24 20:43, Sean Anderson wrote:
> On 8/18/23 10:15, Mirsad Todorovac wrote:
>>
>>
>> On 8/17/23 21:32, Mirsad Todorovac wrote:
>>> Hi,
>>>
>>> This is your friendly bug reporter. :-)
>>>
>>> The environment is the vanilla torvalds kernel on Ubuntu 22.04 LTS and a Ryzen 7950X assembled box.
>>>
>>> [ 7145.500079] ==================================================================
>>> [ 7145.500109] BUG: KCSAN: data-race in tick_nohz_idle_stop_tick / tick_nohz_next_event
>>>
>>> [ 7145.500139] write to 0xffffffff8a2bda30 of 4 bytes by task 0 on cpu 26:
>>> [ 7145.500155]  tick_nohz_idle_stop_tick+0x3b1/0x4a0
>>> [ 7145.500173]  do_idle+0x1e3/0x250
>>> [ 7145.500186]  cpu_startup_entry+0x20/0x30
>>> [ 7145.500199]  start_secondary+0x129/0x160
>>> [ 7145.500216]  secondary_startup_64_no_verify+0x17e/0x18b
>>>
>>> [ 7145.500245] read to 0xffffffff8a2bda30 of 4 bytes by task 0 on cpu 16:
>>> [ 7145.500261]  tick_nohz_next_event+0xe7/0x1e0
>>> [ 7145.500277]  tick_nohz_get_sleep_length+0xa7/0xe0
>>> [ 7145.500293]  menu_select+0x82/0xb90
>>> [ 7145.500309]  cpuidle_select+0x44/0x60
>>> [ 7145.500324]  do_idle+0x1c2/0x250
>>> [ 7145.500336]  cpu_startup_entry+0x20/0x30
>>> [ 7145.500350]  start_secondary+0x129/0x160
>>> [ 7145.500367]  secondary_startup_64_no_verify+0x17e/0x18b
>>>
>>> [ 7145.500392] value changed: 0x0000001a -> 0xffffffff
>>>
>>> [ 7145.500411] Reported by Kernel Concurrency Sanitizer on:
>>> [ 7145.500421] CPU: 16 PID: 0 Comm: swapper/16 Tainted: G             L     6.5.0-rc6-net-cfg-kcsan-00038-g16931859a650 #35
>>> [ 7145.500439] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
>>> [ 7145.500449] ==================================================================
>>>
>>> Please find dmesg log (what was available in the ring buffer) and the lshw output attached.
>>>
>>> NOTE: There are no proprietary drivers loaded and Canonical livepatch isn't working for this kernel,
>>> probably previous ACPI KCSAN BUGs turn this taint on.
>>>
>>> Thank you very much for your time evaluating this report.
>>
>> P.S.
>>
>> According to Mr. Heo, I will add the decoded stacktrace to this bug report, which ought to have been done
>> in the first place :-/
>>
>> [ 7145.500079] ==================================================================
>> [ 7145.500109] BUG: KCSAN: data-race in tick_nohz_idle_stop_tick / tick_nohz_next_event
>>
>> [ 7145.500139] write to 0xffffffff8a2bda30 of 4 bytes by task 0 on cpu 26:
>> [ 7145.500155] tick_nohz_idle_stop_tick (kernel/time/tick-sched.c:904 kernel/time/tick-sched.c:1130)
>> [ 7145.500173] do_idle (kernel/sched/idle.c:215 kernel/sched/idle.c:282)
>> [ 7145.500186] cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
>> [ 7145.500199] start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
>> [ 7145.500216] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:441)
>>
>> [ 7145.500245] read to 0xffffffff8a2bda30 of 4 bytes by task 0 on cpu 16:
>> [ 7145.500261] tick_nohz_next_event (kernel/time/tick-sched.c:868)
>> [ 7145.500277] tick_nohz_get_sleep_length (kernel/time/tick-sched.c:1250)
>> [ 7145.500293] menu_select (drivers/cpuidle/governors/menu.c:283)
>> [ 7145.500309] cpuidle_select (drivers/cpuidle/cpuidle.c:359)
>> [ 7145.500324] do_idle (kernel/sched/idle.c:208 kernel/sched/idle.c:282)
>> [ 7145.500336] cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
>> [ 7145.500350] start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
>> [ 7145.500367] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:441)
>>
>> [ 7145.500392] value changed: 0x0000001a -> 0xffffffff
>>
>> [ 7145.500411] Reported by Kernel Concurrency Sanitizer on:
>> [ 7145.500421] CPU: 16 PID: 0 Comm: swapper/16 Tainted: G             L     6.5.0-rc6-net-cfg-kcsan-00038-g16931859a650 #35
>> [ 7145.500439] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
>> [ 7145.500449] ==================================================================
>>
>> The code seems to be:
>>
>>
>>      /*
>>       * If this CPU is the one which updates jiffies, then give up
>>       * the assignment and let it be taken by the CPU which runs
>>       * the tick timer next, which might be this CPU as well. If we
>>       * don't drop this here the jiffies might be stale and
>>       * do_timer() never invoked. Keep track of the fact that it
>>       * was the one which had the do_timer() duty last.
>>       */
>>      if (cpu == tick_do_timer_cpu) {
>>          tick_do_timer_cpu = TICK_DO_TIMER_NONE;
>> →        ts->do_timer_last = 1;
>>      } else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) {
>>          ts->do_timer_last = 0;
>>
>>      }
>>
>> but I am not sure why this happens.
>>
>> Maybe a WRITE_ONCE(ts->do_timer_last, 1) is required?
>>
>> Or is it another KCSAN false positive ...
> 
> I have seen a few more KCSAN reports regarding this variable:
> 
> [  153.285199] ==================================================================
> [  153.292529] BUG: KCSAN: data-race in tick_nohz_next_event+0x15c/0x29c
> [  153.299118]
> [  153.300685] race at unknown origin, with read to 0xffffffc082aa1048 of 4 bytes by task 0 on cpu 2:
> [  153.309760]  tick_nohz_next_event+0x15c/0x29c
> [  153.314256]  tick_nohz_get_sleep_length+0xd8/0x118
> [  153.319186]  menu_select+0x1e0/0xc08
> [  153.322875]  cpuidle_select+0x4c/0x60
> [  153.326686]  do_idle+0x1e0/0x2b4
> [  153.330062]  cpu_startup_entry+0x34/0x3c
> [  153.334133]  secondary_start_kernel+0x1e8/0x21c
> [  153.338821]  __secondary_switched+0xb8/0xbc
> [  153.343143]
> [  153.344706] no locks held by swapper/2/0.
> [  153.348802] irq event stamp: 129586
> [  153.352370] hardirqs last  enabled at (129585): [<ffffffc081acaa2c>] el1_interrupt+0x40/0x50
> [  153.360964] hardirqs last disabled at (129586): [<ffffffc0800d8fec>] do_idle+0xc8/0x2b4
> [  153.369132] softirqs last  enabled at (129560): [<ffffffc08001071c>] __do_softirq+0x47c/0x500
> [  153.377786] softirqs last disabled at (129549): [<ffffffc080017134>] ____do_softirq+0x10/0x1c
> [  153.386457]
> [  153.388019] value changed: 0xffffffff -> 0x00000001
> [  153.392993]
> [  153.394555] Reported by Kernel Concurrency Sanitizer on:
> [  153.399947] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 6.6.20+ #102
> [  153.406595] Hardware name: xlnx,zynqmp (DT)
> [  153.410866] ==================================================================
> 
> and:
> 
> [   75.362055] ==================================================================
> [   75.369461] BUG: KCSAN: data-race in tick_sched_do_timer / tick_sched_do_timer
> [   75.376854]
> [   75.378426] write to 0xffffffc082aa1048 of 4 bytes by interrupt on cpu 1:
> [   75.385335]  tick_sched_do_timer+0x124/0x168
> [   75.389744]  tick_sched_timer+0x48/0xc8
> [   75.393720]  __hrtimer_run_queues+0x200/0x5f0
> [   75.398198]  hrtimer_interrupt+0x1b0/0x3f4
> [   75.402417]  arch_timer_handler_phys+0x40/0x54
> [   75.406983]  handle_percpu_devid_irq+0xf8/0x20c
> [   75.411679]  generic_handle_domain_irq+0x48/0x64
> [   75.416435]  gic_handle_irq+0xc4/0x100
> [   75.420298]  call_on_irq_stack+0x24/0x4c
> [   75.424360]  do_interrupt_handler+0xc0/0xc4
> [   75.428692]  el1_interrupt+0x34/0x50
> [   75.432415]  el1h_64_irq_handler+0x18/0x24
> [   75.436668]  el1h_64_irq+0x64/0x68
> [   75.440184]  percpu_counter_add_batch+0xd8/0x1b8
> [   75.444967]  set_pte_range+0x14c/0x2c8
> [   75.448838]  filemap_map_pages+0x2dc/0xa38
> [   75.453091]  __handle_mm_fault+0x1304/0x17dc
> [   75.457492]  handle_mm_fault+0xa0/0x374
> [   75.461450]  do_page_fault+0x1f8/0x620
> [   75.465365]  do_translation_fault+0x118/0x138
> [   75.469887]  do_mem_abort+0x5c/0xe4
> [   75.473516]  el0_da+0x5c/0xac
> [   75.476623]  el0t_64_sync_handler+0xe4/0x158
> [   75.481050]  el0t_64_sync+0x190/0x194
> [   75.484834]
> [   75.486405] 5 locks held by (d-logind)/293:
> [   75.490693]  #0: ffffff88076753c0 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x15c/0x5c0
> [   75.500441]  #1: ffffffc082b042f0 (rcu_read_lock){....}-{1:2}, at: __handle_mm_fault+0x1260/0x17dc
> [   75.509755]  #2: ffffffc082b042f0 (rcu_read_lock){....}-{1:2}, at: filemap_map_pages+0x0/0xa38
> [   75.518748]  #3: ffffffc082b042f0 (rcu_read_lock){....}-{1:2}, at: __pte_offset_map+0x0/0x2a0
> [   75.527662]  #4: ffffff880762b018 (ptlock_ptr(ptdesc)){+.+.}-{2:2}, at: __pte_offset_map_lock+0xa4/0x264
> [   75.537488] irq event stamp: 1761
> [   75.540890] hardirqs last  enabled at (1760): [<ffffffc080bc1f78>] percpu_counter_add_batch+0x160/0x1b8
> [   75.550465] hardirqs last disabled at (1761): [<ffffffc081aca990>] el1_interrupt+0x24/0x50
> [   75.558893] softirqs last  enabled at (1534): [<ffffffc08001071c>] __do_softirq+0x47c/0x500
> [   75.567382] softirqs last disabled at (1529): [<ffffffc080017134>] ____do_softirq+0x10/0x1c
> [   75.575889]
> [   75.577460] read to 0xffffffc082aa1048 of 4 bytes by interrupt on cpu 2:
> [   75.584273]  tick_sched_do_timer+0x48/0x168
> [   75.588596]  tick_sched_timer+0x48/0xc8
> [   75.592572]  __hrtimer_run_queues+0x200/0x5f0
> [   75.597050]  hrtimer_interrupt+0x1b0/0x3f4
> [   75.601269]  arch_timer_handler_phys+0x40/0x54
> [   75.605835]  handle_percpu_devid_irq+0xf8/0x20c
> [   75.610531]  generic_handle_domain_irq+0x48/0x64
> [   75.615287]  gic_handle_irq+0xc4/0x100
> [   75.619150]  call_on_irq_stack+0x24/0x4c
> [   75.623204]  do_interrupt_handler+0xc0/0xc4
> [   75.627535]  el1_interrupt+0x34/0x50
> [   75.631258]  el1h_64_irq_handler+0x18/0x24
> [   75.635512]  el1h_64_irq+0x64/0x68
> [   75.639027]  lock_acquire+0x13c/0x2f0
> [   75.642820]  count_memcg_event_mm.part.0+0x70/0x2a8
> [   75.647829]  handle_mm_fault+0x1d8/0x374
> [   75.651874]  do_page_fault+0x1f8/0x620
> [   75.655789]  do_mem_abort+0x5c/0xe4
> [   75.659417]  el0_da+0x5c/0xac
> [   75.662524]  el0t_64_sync_handler+0xe4/0x158
> [   75.666951]  el0t_64_sync+0x190/0x194
> [   75.670736]
> [   75.672307] 2 locks held by systemd/1:
> [   75.676152]  #0: ffffff88042627b0 (&vma->vm_lock->lock){++++}-{3:3}, at: lock_vma_under_rcu+0x15c/0x5c0
> [   75.685900]  #1: ffffffc082b042f0 (rcu_read_lock){....}-{1:2}, at: count_memcg_event_mm.part.0+0x0/0x2a8
> [   75.695726] irq event stamp: 576347
> [   75.699302] hardirqs last  enabled at (576346): [<ffffffc08001af70>] local_daif_restore+0x38/0x54
> [   75.708338] hardirqs last disabled at (576347): [<ffffffc081aca990>] el1_interrupt+0x24/0x50
> [   75.716940] softirqs last  enabled at (576161): [<ffffffc08001071c>] __do_softirq+0x47c/0x500
> [   75.725602] softirqs last disabled at (576154): [<ffffffc080017134>] ____do_softirq+0x10/0x1c
> [   75.734283]
> [   75.735854] value changed: 0xffffffff -> 0x00000000
> [   75.740836]
> [   75.742407] Reported by Kernel Concurrency Sanitizer on:
> [   75.747808] CPU: 2 PID: 1 Comm: systemd Not tainted 6.6.20+ #101
> [   75.754290] Hardware name: xlnx,zynqmp (DT)
> [   75.758563] ==================================================================
> 
> The justification seems to be in tick_sched_do_timer:
> 
>          /*
>           * Check if the do_timer duty was dropped. We don't care about
>           * concurrency: This happens only when the CPU in charge went
>           * into a long sleep. If two CPUs happen to assign themselves to
>           * this duty, then the jiffies update is still serialized by
>           * jiffies_lock.
>           *
>           * If nohz_full is enabled, this should not happen because the
>           * tick_do_timer_cpu never relinquishes.
>           */
> 
> with the other assignment being in tick_nohz_stop_tick. I'm not familiar
> enough with this code to say whether we should be using READ/WRITE_ONCE
> or maybe just data_race (as would be implied by the comment above).
> 
> --Sean
> 
> [Embedded World 2024, SECO SpA]<https://www.messe-ticket.de/Nuernberg/embeddedworld2024/Register/ew24517689>