Date:   Mon, 18 Sep 2017 15:24:59 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        "H. Peter Anvin" <hpa@...or.com>
Subject: [LOCKDEP SPLAT 4.14-rc1] Conflict between smpboot thread create
 completions and get_online_cpus

I hit the following lockdep splat:

 ======================================================
 WARNING: possible circular locking dependency detected
 4.14.0-rc1-test+ #99 Not tainted
 ------------------------------------------------------
 watchdog/3/27 is trying to acquire lock:
  (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff8100ea07>] release_ds_buffers+0x27/0xb0
 
 but now in release context of a crosslock acquired at the following:
  ((complete)&self->parked){+.+.}, at: [<ffffffff810ca616>] kthread_park+0x46/0x60
 
 which lock already depends on the new lock.
 
 
 the existing dependency chain (in reverse order) is:
 
 -> #2 ((complete)&self->parked){+.+.}:
        __lock_acquire+0x10af/0x1110
        lock_acquire+0xea/0x1f0
        wait_for_completion+0x52/0x1a0
        kthread_park+0x46/0x60
        __smpboot_create_thread.part.3+0x7d/0xf0
        smpboot_register_percpu_thread_cpumask+0xb4/0x130
        spawn_ksoftirqd+0x3b/0x45
        do_one_initcall+0x43/0x171
        kernel_init_freeable+0x134/0x270
        kernel_init+0xe/0x101
        ret_from_fork+0x2a/0x40
 
 -> #1 (smpboot_threads_lock){+.+.}:
        __lock_acquire+0x10af/0x1110
        lock_acquire+0xea/0x1f0
        __mutex_lock+0x86/0x990
        mutex_lock_nested+0x1b/0x20
        smpboot_register_percpu_thread_cpumask+0x59/0x130
        spawn_ksoftirqd+0x3b/0x45
        do_one_initcall+0x43/0x171
        kernel_init_freeable+0x134/0x270
        kernel_init+0xe/0x101
        ret_from_fork+0x2a/0x40
 
 -> #0 (cpu_hotplug_lock.rw_sem){++++}:
        cpus_read_lock+0x57/0xc0
        release_ds_buffers+0x27/0xb0
        x86_release_hardware+0x8f/0xa0
        hw_perf_event_destroy+0xe/0x20
        _free_event+0xca/0x3d0
 
 other info that might help us debug this:
 
 Chain exists of:
   cpu_hotplug_lock.rw_sem --> smpboot_threads_lock --> (complete)&self->parked
 
  Possible unsafe locking scenario by crosslock:
 
        CPU0                    CPU1
        ----                    ----
   lock(smpboot_threads_lock);
   lock((complete)&self->parked);
                                lock(cpu_hotplug_lock.rw_sem);
                                unlock((complete)&self->parked);
 
  *** DEADLOCK ***
 
 1 lock held by watchdog/3/27:
  #0:  (&x->wait){....}, at: [<ffffffff810f59bd>] complete+0x1d/0x60
 
 stack backtrace:
 CPU: 3 PID: 27 Comm: watchdog/3 Not tainted 4.14.0-rc1-test+ #99
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
 Call Trace:
  dump_stack+0x86/0xcf
  print_circular_bug+0x209/0x300
  check_prev_add+0x3be/0x700
  ? trace_event_raw_event_lock+0xf0/0xf0
  lock_commit_crosslock+0x439/0x5b0
  ? lock_commit_crosslock+0x439/0x5b0
  complete+0x29/0x60
  ? __kthread_parkme+0x45/0xa0
  __kthread_parkme+0x71/0xa0
  ? smpboot_thread_fn+0x2d/0x280
  kthread_parkme+0x24/0x40
  smpboot_thread_fn+0xac/0x280
  kthread+0x152/0x190
  ? sort_range+0x30/0x30
  ? kthread_create_on_node+0x40/0x40
  ret_from_fork+0x2a/0x40
 NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.


And here is some debug tracing from where the lockdep splat happened:

 watchdog-27      3.... 764857us : cpus_read_lock+0x5/0xc0: lock cpu
 watchdog-27      3.... 764858us : <stack trace>
  => x86_release_hardware+0x8f/0xa0
  => hw_perf_event_destroy+0xe/0x20
  => _free_event+0xca/0x3d0
  => put_event+0x14/0x20
  => perf_event_release_kernel+0x236/0x330
  => watchdog_nmi_disable+0x49/0x70
  => watchdog_disable+0x4d/0x60
  => smpboot_thread_fn+0x138/0x280
  => kthread+0x152/0x190
  => ret_from_fork+0x2a/0x40
 watchdog-27      3.... 764917us : cpus_read_unlock+0x5/0xa0: unlock cpu
 watchdog-27      3d..1 764918us : lock_commit_crosslock+0xc1/0x5b0: xhlock=132

I found that the chain at issue is the following (the completion shows up in
the chain as a "crosslock" because lockdep's new cross-release tracking treats
completions as locks):

On the side creating the per-CPU smpboot hotplug thread, we have:

	mutex_lock(&smpboot_threads_lock);
	
	tsk = kthread_create_on_cpu(smpboot_thread_fn, td, cpu,
				    ht->thread_comm);
	kthread_park(tsk);

		wait_for_completion(&kthread->parked);
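
For completeness, the cpu_hotplug_lock --> smpboot_threads_lock part of the
chain should come from this same registration path:
smpboot_register_percpu_thread_cpumask() takes the hotplug read lock before
grabbing smpboot_threads_lock. Condensed to just the lock ordering
(paraphrased from my read of kernel/smpboot.c, not the verbatim code):

	smpboot_register_percpu_thread_cpumask()
	{
		get_online_cpus();		/* cpu_hotplug_lock, read side */
		mutex_lock(&smpboot_threads_lock);

		for_each_online_cpu(cpu)
			__smpboot_create_thread();	/* -> kthread_park()
							 *    -> wait_for_completion()
							 */

		mutex_unlock(&smpboot_threads_lock);
		put_online_cpus();
	}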


In the meantime, the thread we are waiting on is doing:

	kthread() {
		smpboot_thread_fn() {
			watchdog_disable() {
				watchdog_nmi_disable() {
					perf_event_release_kernel() {
						put_event() {
							_free_event() {
								event->destroy() {
								(hw_perf_event_destroy())
									x86_release_hardware() {
										release_ds_buffers() {
											get_online_cpus();
											put_online_cpus();
										}
									}
								}
							}
						}
					}
				}
			}
			kthread_parkme() {
				complete(&self->parked);


Or the abridged version:

	A task is created under smpboot_threads_lock, and then wait_for_completion()
	is called with that lock still held.

	There is also a case where get_online_cpus() is taken before smpboot_threads_lock.

	And the task being waited on calls get_online_cpus() before doing the complete().

This creates the possibility of a deadlock, because we can have:

	TASK1		TASK2		TASK3
	-----		-----		-----
  mutex_lock(&smpboot_threads_lock);
  wait_for_completion();
  /* blocked */

		    get_online_cpus();
		    mutex_lock(&smpboot_threads_lock);
		    /* blocked */

				    get_online_cpus();
				    /* block */

				    /* Still needs to do: */

				    put_online_cpus();

				    complete();
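
To see the dependency outside of kernel context, here is a minimal userspace
sketch of the same three-task pattern using pthreads (build with gcc -pthread).
The names are made-up stand-ins (threads_lock for smpboot_threads_lock,
hotplug_lock for cpu_hotplug_lock, a semaphore for the parked completion), and
the hotplug lock is modeled as a plain mutex to keep the sketch self-contained
(the real percpu rwsem read side only blocks like this when a writer is also
queued). Running it is expected to hang, which is the point:

#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

static pthread_mutex_t threads_lock = PTHREAD_MUTEX_INITIALIZER; /* ~smpboot_threads_lock */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER; /* ~cpu_hotplug_lock (simplified) */
static sem_t parked;                                             /* ~the "parked" completion */

static void *task1(void *arg)		/* registers the percpu thread */
{
	(void)arg;
	pthread_mutex_lock(&threads_lock);
	sem_wait(&parked);		/* wait_for_completion(); blocks forever */
	pthread_mutex_unlock(&threads_lock);
	return NULL;
}

static void *task2(void *arg)		/* holds the hotplug lock, wants the mutex */
{
	(void)arg;
	pthread_mutex_lock(&hotplug_lock);	/* get_online_cpus() */
	pthread_mutex_lock(&threads_lock);	/* blocks behind TASK1 */
	pthread_mutex_unlock(&threads_lock);
	pthread_mutex_unlock(&hotplug_lock);
	return NULL;
}

static void *task3(void *arg)		/* the thread TASK1 is waiting on */
{
	(void)arg;
	pthread_mutex_lock(&hotplug_lock);	/* get_online_cpus(); blocks behind TASK2 */
	pthread_mutex_unlock(&hotplug_lock);	/* put_online_cpus() */
	sem_post(&parked);			/* complete(); never reached */
	return NULL;
}

int main(void)
{
	pthread_t t1, t2, t3;

	sem_init(&parked, 0, 0);
	pthread_create(&t1, NULL, task1, NULL);
	sleep(1);			/* crude ordering: TASK1, TASK2, then TASK3 */
	pthread_create(&t2, NULL, task2, NULL);
	sleep(1);
	pthread_create(&t3, NULL, task3, NULL);
	pthread_join(t1, NULL);		/* never returns */
	return 0;
}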

Sadly, this splat happens at boot up and prevents me from testing the
other locking changes I'm playing with.

-- Steve
