Message-ID: <1504350596.16793.44.camel@gmx.de>
Date: Sat, 02 Sep 2017 13:09:56 +0200
From: Mike Galbraith <efault@....de>
To: LKML <linux-kernel@...r.kernel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Subject: hotplug lockdep splat (tip-rt)
4.11-rt rolled forward; sprinkle liberally with seasoning of choice.
[ 7514.772861] ======================================================
[ 7514.772862] WARNING: possible circular locking dependency detected
[ 7514.772863] 4.13.0.g06260ca-rt11-tip-lockdep #20 Tainted: G E
[ 7514.772863] ------------------------------------------------------
[ 7514.772867] stress-cpu-hotp/4102 is trying to acquire lock:
[ 7514.772867] ((complete)&st->done){+.+.}, at: [<ffffffff8107208a>] takedown_cpu+0x9a/0x120
[ 7514.772877]
[ 7514.772877] but task is already holding lock:
[ 7514.772877] (sparse_irq_lock){+.+.}, at: [<ffffffff8107203a>] takedown_cpu+0x4a/0x120
[ 7514.772879]
[ 7514.772879] which lock already depends on the new lock.
[ 7514.772879]
[ 7514.772879]
[ 7514.772879] the existing dependency chain (in reverse order) is:
[ 7514.772880]
[ 7514.772880] -> #2 (sparse_irq_lock){+.+.}:
[ 7514.772889] lock_acquire+0xbd/0x250
[ 7514.772908] _mutex_lock+0x31/0x50
[ 7514.772913] irq_affinity_online_cpu+0x13/0xc0
[ 7514.772914] cpuhp_invoke_callback+0x24c/0x9c0
[ 7514.772914] cpuhp_up_callbacks+0x30/0xb0
[ 7514.772915] cpuhp_thread_fun+0x159/0x170
[ 7514.772918] smpboot_thread_fn+0x268/0x310
[ 7514.772919] kthread+0x145/0x180
[ 7514.772921] ret_from_fork+0x2a/0x40
[ 7514.772922]
[ 7514.772922] -> #1 (cpuhp_state){+.+.}:
[ 7514.772926] smpboot_thread_fn+0x268/0x310
[ 7514.772927] kthread+0x145/0x180
[ 7514.772928] ret_from_fork+0x2a/0x40
[ 7514.772930] 0xffffffffffffffff
[ 7514.772930]
[ 7514.772930] -> #0 ((complete)&st->done){+.+.}:
[ 7514.772932] __lock_acquire+0x113b/0x1190
[ 7514.772933] lock_acquire+0xbd/0x250
[ 7514.772934] wait_for_completion+0x51/0x120
[ 7514.772935] takedown_cpu+0x9a/0x120
[ 7514.772936] cpuhp_invoke_callback+0x24c/0x9c0
[ 7514.772937] cpuhp_down_callbacks+0x3b/0x80
[ 7514.772939] _cpu_down+0xba/0xf0
[ 7514.772940] do_cpu_down+0x35/0x50
[ 7514.772949] device_offline+0x7d/0xa0
[ 7514.772950] online_store+0x3a/0x70
[ 7514.772959] kernfs_fop_write+0x10a/0x190
[ 7514.772962] __vfs_write+0x23/0x150
[ 7514.772963] vfs_write+0xc2/0x1c0
[ 7514.772964] SyS_write+0x45/0xa0
[ 7514.772965] entry_SYSCALL_64_fastpath+0x1f/0xbe
[ 7514.772966]
[ 7514.772966] other info that might help us debug this:
[ 7514.772966]
[ 7514.772966] Chain exists of:
[ 7514.772966] (complete)&st->done --> cpuhp_state --> sparse_irq_lock
[ 7514.772966]
[ 7514.772968] Possible unsafe locking scenario:
[ 7514.772968]
[ 7514.772968]        CPU0                    CPU1
[ 7514.772968]        ----                    ----
[ 7514.772968]   lock(sparse_irq_lock);
[ 7514.772969]                                lock(cpuhp_state);
[ 7514.772970]                                lock(sparse_irq_lock);
[ 7514.772970]   lock((complete)&st->done);
[ 7514.772971]
[ 7514.772971] *** DEADLOCK ***
[ 7514.772971]
[ 7514.772972] 8 locks held by stress-cpu-hotp/4102:
[ 7514.772972] #0: (sb_writers#4){.+.+}, at: [<ffffffff8126c410>] vfs_write+0x190/0x1c0
[ 7514.772974] #1: (&of->mutex){+.+.}, at: [<ffffffff8130187a>] kernfs_fop_write+0xda/0x190
[ 7514.772976] #2: (s_active#140){.+.+}, at: [<ffffffff81301882>] kernfs_fop_write+0xe2/0x190
[ 7514.772979] #3: (device_hotplug_lock){+.+.}, at: [<ffffffff8153aa71>] lock_device_hotplug_sysfs+0x11/0x40
[ 7514.772981] #4: (&dev->mutex){....}, at: [<ffffffff8153c20f>] device_offline+0x3f/0xa0
[ 7514.772983] #5: (cpu_add_remove_lock){+.+.}, at: [<ffffffff8107352f>] do_cpu_down+0x1f/0x50
[ 7514.772985] #6: (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff810d2101>] percpu_down_write+0x21/0x110
[ 7514.772987] #7: (sparse_irq_lock){+.+.}, at: [<ffffffff8107203a>] takedown_cpu+0x4a/0x120
[ 7514.772989]
[ 7514.772989] stack backtrace:
[ 7514.772990] CPU: 5 PID: 4102 Comm: stress-cpu-hotp Tainted: G E 4.13.0.g06260ca-rt11-tip-lockdep #20
[ 7514.772991] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
[ 7514.772992] Call Trace:
[ 7514.772995] dump_stack+0x7c/0xbf
[ 7514.772997] print_circular_bug+0x2d3/0x2e0
[ 7514.772999] ? copy_trace+0xb0/0xb0
[ 7514.773001] check_prev_add+0x666/0x700
[ 7514.773002] ? copy_trace+0xb0/0xb0
[ 7514.773008] ? __stop_cpus+0x51/0x70
[ 7514.773010] ? copy_trace+0xb0/0xb0
[ 7514.773011] __lock_acquire+0x113b/0x1190
[ 7514.773013] ? trace_hardirqs_on_caller+0xf2/0x1a0
[ 7514.773015] lock_acquire+0xbd/0x250
[ 7514.773018] ? takedown_cpu+0x9a/0x120
[ 7514.773020] wait_for_completion+0x51/0x120
[ 7514.773021] ? takedown_cpu+0x9a/0x120
[ 7514.773022] ? cpuhp_invoke_callback+0x9c0/0x9c0
[ 7514.773023] takedown_cpu+0x9a/0x120
[ 7514.773025] ? cpuhp_complete_idle_dead+0x10/0x10
[ 7514.773026] cpuhp_invoke_callback+0x24c/0x9c0
[ 7514.773028] cpuhp_down_callbacks+0x3b/0x80
[ 7514.773030] _cpu_down+0xba/0xf0
[ 7514.773031] do_cpu_down+0x35/0x50
[ 7514.773033] device_offline+0x7d/0xa0
[ 7514.773034] online_store+0x3a/0x70
[ 7514.773036] kernfs_fop_write+0x10a/0x190
[ 7514.773037] __vfs_write+0x23/0x150
[ 7514.773039] ? rcu_read_lock_sched_held+0x9b/0xb0
[ 7514.773043] ? rcu_sync_lockdep_assert+0x2d/0x60
[ 7514.773045] ? __sb_start_write+0x190/0x240
[ 7514.773046] ? vfs_write+0x190/0x1c0
[ 7514.773048] vfs_write+0xc2/0x1c0
[ 7514.773050] SyS_write+0x45/0xa0
[ 7514.773051] entry_SYSCALL_64_fastpath+0x1f/0xbe
[ 7514.773053] RIP: 0033:0x7fc5e51fd2d0
[ 7514.773053] RSP: 002b:00007ffd73d17678 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 7514.773054] RAX: ffffffffffffffda RBX: 00007fc5e54bd678 RCX: 00007fc5e51fd2d0
[ 7514.773055] RDX: 0000000000000002 RSI: 00007fc5e5d64000 RDI: 0000000000000001
[ 7514.773056] RBP: 00007fc5e54bd620 R08: 000000000000000a R09: 00007fc5e5d16700
[ 7514.773056] R10: 000000000198bc50 R11: 0000000000000246 R12: 0000000000000110
[ 7514.773057] R13: 00000000000000e4 R14: 0000000000002710 R15: 00000000000000f1
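
For what it's worth, the shape of the inversion is easy to sketch in
userspace.  The sketch below uses pthreads and made-up names, and collapses
the three-lock chain to its two ends -- it is illustrative only, not the
kernel code paths.  One thread plays takedown_cpu(): take sparse_irq_lock,
then wait for the completion.  The other plays the hotplug-thread side that
wants sparse_irq_lock before it can ever signal that completion.  If the
first thread wins the race, neither makes progress:

/*
 * Userspace sketch of the inversion above (pthreads, made-up names,
 * chain collapsed to its two ends -- illustrative only, not kernel code).
 *
 * offliner() plays takedown_cpu(): grab sparse_irq_lock, then wait for
 * the "completion".  hotplug_thread() plays the cpuhp-thread side that
 * needs sparse_irq_lock before it can ever signal that completion.
 * If offliner() gets sparse_irq_lock first, neither thread progresses.
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t sparse_irq_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t st_done_lock    = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  st_done_cond    = PTHREAD_COND_INITIALIZER;
static int st_done;

static void *offliner(void *arg)
{
	pthread_mutex_lock(&sparse_irq_lock);	/* takedown_cpu() */
	pthread_mutex_lock(&st_done_lock);
	while (!st_done)			/* wait_for_completion(&st->done) */
		pthread_cond_wait(&st_done_cond, &st_done_lock);
	pthread_mutex_unlock(&st_done_lock);
	pthread_mutex_unlock(&sparse_irq_lock);
	return NULL;
}

static void *hotplug_thread(void *arg)
{
	pthread_mutex_lock(&sparse_irq_lock);	/* irq_affinity_online_cpu() */
	pthread_mutex_lock(&st_done_lock);
	st_done = 1;				/* complete(&st->done) */
	pthread_cond_signal(&st_done_cond);
	pthread_mutex_unlock(&st_done_lock);
	pthread_mutex_unlock(&sparse_irq_lock);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, offliner, NULL);
	pthread_create(&b, NULL, hotplug_thread, NULL);
	pthread_join(a, NULL);			/* hangs here if offliner() won the race */
	pthread_join(b, NULL);
	puts("no deadlock this time");
	return 0;
}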