lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Mon, 18 Sep 2017 15:24:59 -0400 From: Steven Rostedt <rostedt@...dmis.org> To: Thomas Gleixner <tglx@...utronix.de> Cc: LKML <linux-kernel@...r.kernel.org>, Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>, "H. Peter Anvin" <hpa@...or.com> Subject: [LOCKDEP SPLAT 4.14-rc1] Conflict between smpboot thread create completions and get_online_cpus I hit the following lockdep splat: ====================================================== WARNING: possible circular locking dependency detected 4.14.0-rc1-test+ #99 Not tainted ------------------------------------------------------ watchdog/3/27 is trying to acquire lock: (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff8100ea07>] release_ds_buffers+0x27/0xb0 but now in release context of a crosslock acquired at the following: ((complete)&self->parked){+.+.}, at: [<ffffffff810ca616>] kthread_park+0x46/0x60 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 ((complete)&self->parked){+.+.}: __lock_acquire+0x10af/0x1110 lock_acquire+0xea/0x1f0 wait_for_completion+0x52/0x1a0 kthread_park+0x46/0x60 __smpboot_create_thread.part.3+0x7d/0xf0 smpboot_register_percpu_thread_cpumask+0xb4/0x130 spawn_ksoftirqd+0x3b/0x45 do_one_initcall+0x43/0x171 kernel_init_freeable+0x134/0x270 kernel_init+0xe/0x101 ret_from_fork+0x2a/0x40 -> #1 (smpboot_threads_lock){+.+.}: __lock_acquire+0x10af/0x1110 lock_acquire+0xea/0x1f0 __mutex_lock+0x86/0x990 mutex_lock_nested+0x1b/0x20 smpboot_register_percpu_thread_cpumask+0x59/0x130 spawn_ksoftirqd+0x3b/0x45 do_one_initcall+0x43/0x171 kernel_init_freeable+0x134/0x270 kernel_init+0xe/0x101 ret_from_fork+0x2a/0x40 -> #0 (cpu_hotplug_lock.rw_sem){++++}: cpus_read_lock+0x57/0xc0 release_ds_buffers+0x27/0xb0 x86_release_hardware+0x8f/0xa0 hw_perf_event_destroy+0xe/0x20 _free_event+0xca/0x3d0 other info that might help us debug this: Chain exists of: cpu_hotplug_lock.rw_sem --> smpboot_threads_lock --> (complete)&self->parked Possible unsafe locking scenario by crosslock: CPU0 CPU1 ---- ---- lock(smpboot_threads_lock); lock((complete)&self->parked); lock(cpu_hotplug_lock.rw_sem); unlock((complete)&self->parked); *** DEADLOCK *** 1 lock held by watchdog/3/27: #0: (&x->wait){....}, at: [<ffffffff810f59bd>] complete+0x1d/0x60 stack backtrace: CPU: 3 PID: 27 Comm: watchdog/3 Not tainted 4.14.0-rc1-test+ #99 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016 Call Trace: dump_stack+0x86/0xcf print_circular_bug+0x209/0x300 check_prev_add+0x3be/0x700 ? trace_event_raw_event_lock+0xf0/0xf0 lock_commit_crosslock+0x439/0x5b0 ? lock_commit_crosslock+0x439/0x5b0 complete+0x29/0x60 ? __kthread_parkme+0x45/0xa0 __kthread_parkme+0x71/0xa0 ? smpboot_thread_fn+0x2d/0x280 kthread_parkme+0x24/0x40 smpboot_thread_fn+0xac/0x280 kthread+0x152/0x190 ? sort_range+0x30/0x30 ? kthread_create_on_node+0x40/0x40 ret_from_fork+0x2a/0x40 NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter. And adding some debugging where the lockdep splat happened: watchdog-27 3.... 764857us : cpus_read_lock+0x5/0xc0: lock cpu watchdog-27 3.... 764858us : <stack trace> => x86_release_hardware+0x8f/0xa0 => hw_perf_event_destroy+0xe/0x20 => _free_event+0xca/0x3d0 => put_event+0x14/0x20 => perf_event_release_kernel+0x236/0x330 => watchdog_nmi_disable+0x49/0x70 => watchdog_disable+0x4d/0x60 => smpboot_thread_fn+0x138/0x280 => kthread+0x152/0x190 => ret_from_fork+0x2a/0x40 watchdog-27 3.... 764917us : cpus_read_unlock+0x5/0xa0: unlock cpu watchdog-27 3d..1 764918us : lock_commit_crosslock+0xc1/0x5b0: xhlock=132 I found that the following chain is of issue: Creating the per_cpu smpboot hotplug thread function: mutex_lock(&smpboot_threads_lock); tsk = kthread_create_on_cpu(smpboot_thread_fn, td, cpu, ht->thread_comm); kthread_park(tsk); wait_for_completion(&kthread->parked); In the mean time, we have that thread doing: kthread() { smpboot_thread_fn() { watchdog_disable() { watchdog_nmi_disable() { perf_event_release_kernel() { put_event() { _free_event() { event->destroy() { (hw_perf_event_destroy()) x86_release_hardware() { release_ds_buffers() { get_online_cpus(); put_online_cpus(); } } } } } } } } kthread_parkme() { complete(&self->park); Or the abridged version: creating a task under smpboot_threads_lock and then calling wait_for_completion() having a case where get_online_cpus() is taken before smpboot_threads_lock Then the task you are waiting for, calls get_online_cpus() before doing the complete. Creates a possibility of a deadlock, because we can have: TASK1 TASK2 TASK3 ----- ----- ----- mutex_lock(&smpboot_threads_lock); wait_for_completion(); /* blocked */ get_online_cpus(); mutex_lock(&smpboot_threads_lock); /* blocked */ get_online_cpus(); /* block */ /* Still needs to do: */ put_online_cpus(); complete(); Sadly, this splat happens at boot up and prevents me from testing my other locking I'm playing with. -- Steve
Powered by blists - more mailing lists