linux-kernel - Re: [PATCH v2] block/mq: fix potential deadlock during cpu hotplug

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Sun, 7 May 2017 19:50:58 -0600
From:   Jens Axboe <axboe@...nel.dk>
To:     Wanpeng Li <kernellwp@...il.com>, linux-kernel@...r.kernel.org
Cc:     Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Wanpeng Li <wanpeng.li@...mail.com>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH v2] block/mq: fix potential deadlock during cpu hotplug

On 05/07/2017 01:14 AM, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@...mail.com>
> 
> This can be triggered by hot-unplug one cpu.
> 
> ======================================================
>  [ INFO: possible circular locking dependency detected ]
>  4.11.0+ #17 Not tainted
>  -------------------------------------------------------
>  step_after_susp/2640 is trying to acquire lock:
>   (all_q_mutex){+.+...}, at: [<ffffffffb33f95b8>] blk_mq_queue_reinit_work+0x18/0x110
> 
>  but task is already holding lock:
>   (cpu_hotplug.lock){+.+.+.}, at: [<ffffffffb306d04f>] cpu_hotplug_begin+0x7f/0xe0
> 
>  which lock already depends on the new lock.
> 
> 
>  the existing dependency chain (in reverse order) is:
> 
>  -> #1 (cpu_hotplug.lock){+.+.+.}:
>         lock_acquire+0x11c/0x230
>         __mutex_lock+0x92/0x990
>         mutex_lock_nested+0x1b/0x20
>         get_online_cpus+0x64/0x80
>         blk_mq_init_allocated_queue+0x3a0/0x4e0
>         blk_mq_init_queue+0x3a/0x60
>         loop_add+0xe5/0x280
>         loop_init+0x124/0x177
>         do_one_initcall+0x53/0x1c0
>         kernel_init_freeable+0x1e3/0x27f
>         kernel_init+0xe/0x100
>         ret_from_fork+0x31/0x40
> 
>  -> #0 (all_q_mutex){+.+...}:
>         __lock_acquire+0x189a/0x18a0
>         lock_acquire+0x11c/0x230
>         __mutex_lock+0x92/0x990
>         mutex_lock_nested+0x1b/0x20
>         blk_mq_queue_reinit_work+0x18/0x110
>         blk_mq_queue_reinit_dead+0x1c/0x20
>         cpuhp_invoke_callback+0x1f2/0x810
>         cpuhp_down_callbacks+0x42/0x80
>         _cpu_down+0xb2/0xe0
>         freeze_secondary_cpus+0xb6/0x390
>         suspend_devices_and_enter+0x3b3/0xa40
>         pm_suspend+0x129/0x490
>         state_store+0x82/0xf0
>         kobj_attr_store+0xf/0x20
>         sysfs_kf_write+0x45/0x60
>         kernfs_fop_write+0x135/0x1c0
>         __vfs_write+0x37/0x160
>         vfs_write+0xcd/0x1d0
>         SyS_write+0x58/0xc0
>         do_syscall_64+0x8f/0x710
>         return_from_SYSCALL_64+0x0/0x7a
> 
>  other info that might help us debug this:
> 
>   Possible unsafe locking scenario:
> 
>         CPU0                    CPU1
>         ----                    ----
>    lock(cpu_hotplug.lock);
>                                 lock(all_q_mutex);
>                                 lock(cpu_hotplug.lock);
>    lock(all_q_mutex);
> 
>   *** DEADLOCK ***
> 
>  8 locks held by step_after_susp/2640:
>   #0:  (sb_writers#6){.+.+.+}, at: [<ffffffffb3244aed>] vfs_write+0x1ad/0x1d0
>   #1:  (&of->mutex){+.+.+.}, at: [<ffffffffb32d3a51>] kernfs_fop_write+0x101/0x1c0
>   #2:  (s_active#166){.+.+.+}, at: [<ffffffffb32d3a59>] kernfs_fop_write+0x109/0x1c0
>   #3:  (pm_mutex){+.+...}, at: [<ffffffffb30d2ecd>] pm_suspend+0x21d/0x490
>   #4:  (acpi_scan_lock){+.+.+.}, at: [<ffffffffb34dc3d7>] acpi_scan_lock_acquire+0x17/0x20
>   #5:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffffb306d6d7>] freeze_secondary_cpus+0x27/0x390
>   #6:  (cpu_hotplug.dep_map){++++++}, at: [<ffffffffb306cfd5>] cpu_hotplug_begin+0x5/0xe0
>   #7:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffffb306d04f>] cpu_hotplug_begin+0x7f/0xe0
> 
>  stack backtrace:
>  CPU: 3 PID: 2640 Comm: step_after_susp Not tainted 4.11.0+ #17
>  Hardware name: Dell Inc. OptiPlex 7040/0JCTF8, BIOS 1.4.9 09/12/2016
>  Call Trace:
>   dump_stack+0x99/0xce
>   print_circular_bug+0x1fa/0x270
>   __lock_acquire+0x189a/0x18a0
>   lock_acquire+0x11c/0x230
>   ? lock_acquire+0x11c/0x230
>   ? blk_mq_queue_reinit_work+0x18/0x110
>   ? blk_mq_queue_reinit_work+0x18/0x110
>   __mutex_lock+0x92/0x990
>   ? blk_mq_queue_reinit_work+0x18/0x110
>   ? kmem_cache_free+0x2cb/0x330
>   ? anon_transport_class_unregister+0x20/0x20
>   ? blk_mq_queue_reinit_work+0x110/0x110
>   mutex_lock_nested+0x1b/0x20
>   ? mutex_lock_nested+0x1b/0x20
>   blk_mq_queue_reinit_work+0x18/0x110
>   blk_mq_queue_reinit_dead+0x1c/0x20
>   cpuhp_invoke_callback+0x1f2/0x810
>   ? __flow_cache_shrink+0x160/0x160
>   cpuhp_down_callbacks+0x42/0x80
>   _cpu_down+0xb2/0xe0
>   freeze_secondary_cpus+0xb6/0x390
>   suspend_devices_and_enter+0x3b3/0xa40
>   ? rcu_read_lock_sched_held+0x79/0x80
>   pm_suspend+0x129/0x490
>   state_store+0x82/0xf0
>   kobj_attr_store+0xf/0x20
>   sysfs_kf_write+0x45/0x60
>   kernfs_fop_write+0x135/0x1c0
>   __vfs_write+0x37/0x160
>   ? rcu_read_lock_sched_held+0x79/0x80
>   ? rcu_sync_lockdep_assert+0x2f/0x60
>   ? __sb_start_write+0xd9/0x1c0
>   ? vfs_write+0x1ad/0x1d0
>   vfs_write+0xcd/0x1d0
>   SyS_write+0x58/0xc0
>   ? rcu_read_lock_sched_held+0x79/0x80
>   do_syscall_64+0x8f/0x710
>   ? trace_hardirqs_on_thunk+0x1a/0x1c
>   entry_SYSCALL64_slow_path+0x25/0x25
> 
> The cpu hotplug path will hold cpu_hotplug.lock and then reinit all exiting 
> queues for blk mq w/ all_q_mutex, however, blk_mq_init_allocated_queue() will 
> contend these two locks in the inversion order. This is due to commit eabe06595d62
> (blk/mq: Cure cpu hotplug lock inversion), it fixes a cpu hotplug lock inversion 
> issue because of hotplug rework, however the hotplug rework is still work-in-progress 
> and lives in a -tip branch and mainline cannot yet trigger that splat. The commit 
> breaks the linus's tree in the merge window, so this patch reverts the lock order 
> and avoids to splat linus's tree.

Gah, so it was actively harmful before the hotplug rework got in... I've
added this for pre-rc1.

-- 
Jens Axboe