lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <202309251423.8ef230c0-oliver.sang@intel.com>
Date:   Mon, 25 Sep 2023 15:18:46 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Tejun Heo <tj@...nel.org>
CC:     <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
        <linux-kernel@...r.kernel.org>, <linux-doc@...r.kernel.org>,
        <oliver.sang@...el.com>
Subject: [linus:master] [workqueue]  8a1dd1e547:
 BUG:KCSAN:data-race_in_print_report/wq_worker_tick



Hello,

kernel test robot noticed "BUG:KCSAN:data-race_in_print_report/wq_worker_tick" on:

commit: 8a1dd1e547c1a037692e7a6da6a76108108c72b1 ("workqueue: Track and monitor per-workqueue CPU time usage")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master 2cf0f715623872823a72e451243bbf555d10d032]
[test failed on linux-next/master 29e400e3ea486bf942b214769fc9778098114113]

in testcase: boot

compiler: gcc-12
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)


we found various KCSAN issues happen on both parent/fbc, but below (1)
only happens on this commit and clean on parent after almost 1000 runs.
we don't have enough knowledge how connect the code change to the issue,
just report what we observed in our tests FYI.

6363845005202148 8a1dd1e547c1a037692e7a6da6a
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
        956:986         -0%         955:981   dmesg.BUG:KCSAN:data-race_in__mutex_add_waiter/print_report
          4:986         -0%           1:981   dmesg.BUG:KCSAN:data-race_in__mutex_remove_waiter/print_report
          1:986         -0%            :981   dmesg.BUG:KCSAN:data-race_in_do_timer/print_report
          1:986          0%           2:981   dmesg.BUG:KCSAN:data-race_in_enqueue_timer/print_report
          4:986         -0%           1:981   dmesg.BUG:KCSAN:data-race_in_print_report/process_one_work
          5:986         -0%           4:981   dmesg.BUG:KCSAN:data-race_in_print_report/threadfunc
          4:986          0%           4:981   dmesg.BUG:KCSAN:data-race_in_print_report/tick_nohz_handler
        175:986          2%         199:981   dmesg.BUG:KCSAN:data-race_in_print_report/tick_nohz_stop_tick
           :986          2%          24:981   dmesg.BUG:KCSAN:data-race_in_print_report/wq_worker_tick     <----- (1)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202309251423.8ef230c0-oliver.sang@intel.com


[  121.059354][    C0] ==================================================================
[  121.060132][    C0] BUG: KCSAN: data-race in print_report / wq_worker_tick
[  121.060132][    C0]
[  121.060132][    C0] write to 0xffff88810336eaa8 of 8 bytes by interrupt on cpu 1:
[ 121.060132][ C0] wq_worker_tick (kernel/workqueue.c:1140) 
[ 121.060132][ C0] scheduler_tick (kernel/sched/core.c:5639) 
[ 121.060132][ C0] update_process_times (kernel/time/timer.c:2078) 
[ 121.060132][ C0] tick_nohz_handler (kernel/time/tick-sched.c:1369) 
[ 121.060132][ C0] __sysvec_apic_timer_interrupt (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 arch/x86/include/asm/trace/irq_vectors.h:41 arch/x86/kernel/apic/apic.c:1113) 
[ 121.060132][ C0] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1106 (discriminator 14)) 
[ 121.060132][ C0] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:645) 
[ 121.060132][ C0] kcsan_setup_watchpoint (kernel/kcsan/core.c:704 (discriminator 7)) 
[ 121.060132][ C0] do_raw_spin_unlock (kernel/locking/spinlock_debug.c:99 kernel/locking/spinlock_debug.c:140) 
[ 121.060132][ C0] _raw_spin_unlock (arch/x86/include/asm/preempt.h:85 include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:186) 
[ 121.060132][ C0] __ww_mutex_lock+0x7fb/0xcfd 
[ 121.060132][ C0] ww_mutex_lock (kernel/locking/mutex.c:873) 
[ 121.060132][ C0] stress_one_work (kernel/locking/test-ww_mutex.c:545) 
[ 121.060132][ C0] process_one_work (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/workqueue.h:108 kernel/workqueue.c:2589) 
[ 121.060132][ C0] worker_thread (include/linux/list.h:292 kernel/workqueue.c:2736) 
[ 121.060132][ C0] kthread (kernel/kthread.c:381) 
[ 121.060132][ C0] ret_from_fork (arch/x86/entry/entry_64.S:314) 
[  121.060132][    C0]
[  121.060132][    C0] read to 0xffff88810336eaa8 of 8 bytes by interrupt on cpu 0:
[ 121.060132][ C0] print_report (kernel/kcsan/report.c:396) 
[ 121.060132][ C0] kcsan_report_known_origin (kernel/kcsan/report.c:504 kernel/kcsan/report.c:694) 
[ 121.060132][ C0] kcsan_setup_watchpoint (kernel/kcsan/core.c:678) 
[ 121.060132][ C0] wq_worker_tick (kernel/workqueue.c:1140) 
[ 121.060132][ C0] scheduler_tick (kernel/sched/core.c:5639) 
[ 121.060132][ C0] update_process_times (kernel/time/timer.c:2078) 
[ 121.060132][ C0] tick_nohz_handler (kernel/time/tick-sched.c:1369) 
[ 121.060132][ C0] __sysvec_apic_timer_interrupt (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 arch/x86/include/asm/trace/irq_vectors.h:41 arch/x86/kernel/apic/apic.c:1113) 
[ 121.060132][ C0] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1106 (discriminator 14)) 
[ 121.060132][ C0] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:645) 
[ 121.060132][ C0] queued_spin_lock_slowpath (include/linux/instrumented.h:69 include/linux/atomic/atomic-instrumented.h:27 arch/x86/include/asm/qspinlock.h:100 kernel/locking/qspinlock.c:327) 
[ 121.060132][ C0] do_raw_spin_lock (kernel/locking/spinlock_debug.c:93 kernel/locking/spinlock_debug.c:117) 
[ 121.060132][ C0] _raw_spin_lock (include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154) 
[ 121.060132][ C0] __ww_mutex_lock+0x941/0xcfd 
[ 121.060132][ C0] ww_mutex_lock (kernel/locking/mutex.c:873) 
[ 121.060132][ C0] stress_reorder_work (kernel/locking/test-ww_mutex.c:505) 
[ 121.060132][ C0] process_one_work (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/workqueue.h:108 kernel/workqueue.c:2589) 
[ 121.060132][ C0] worker_thread (include/linux/list.h:292 kernel/workqueue.c:2736) 
[ 121.060132][ C0] kthread (kernel/kthread.c:381) 
[ 121.060132][ C0] ret_from_fork (arch/x86/entry/entry_64.S:314) 
[  121.060132][    C0]
[  121.060132][    C0] value changed: 0x00000000001364d8 -> 0x00000000001368c0
[  121.060132][    C0]
[  121.060132][    C0] Reported by Kernel Concurrency Sanitizer on:
[  121.060132][    C0] CPU: 0 PID: 42 Comm: kworker/u4:3 Tainted: G                T  6.4.0-rc1-00009-g8a1dd1e547c1 #1
[  121.060132][    C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[  121.060132][    C0] Workqueue: test-ww_mutex stress_reorder_work
[  121.060132][    C0] ==================================================================
[  121.932570][    T1] All ww mutex selftests passed
[  121.942977][    T1] Initialise system trusted keyrings
[  121.950392][    T1] workingset: timestamp_bits=46 max_order=22 bucket_order=0
[  121.975697][    T1] NET: Registered PF_ALG protocol family
[  121.977547][    T1] Key type asymmetric registered
[  121.979060][    T1] Asymmetric key parser 'x509' registered
[  121.980897][    T1] Asymmetric key parser 'pkcs8' registered
[  121.982622][    T1]
[  121.982622][    T1] Start testing find_bit() with random-filled bitmap
[  122.008302][    T1] find_next_bit:                 8628942 ns, 163936 iterations
[  122.020467][    T1] find_next_zero_bit:            9789528 ns, 163745 iterations
[  122.032835][    T1] find_last_bit:                10001837 ns, 163935 iterations
[  122.248386][    T1] find_nth_bit:                213810108 ns,  16175 iterations
[  122.437809][    T1] find_first_bit:              187071076 ns,  16176 iterations



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230925/202309251423.8ef230c0-oliver.sang@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ