Message-ID: <0002b2f3-d17c-0d49-52f4-b2ce31832e6c@kernel.dk>
Date:   Fri, 4 Oct 2019 07:22:25 -0600
From:   Jens Axboe <axboe@...nel.dk>
To:     Mika Westerberg <mika.westerberg@...ux.intel.com>,
        "Rafael J . Wysocki" <rjw@...ysocki.net>
Cc:     Pavel Machek <pavel@....cz>, Jan Kara <jack@...e.cz>,
        Tejun Heo <tj@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Thomas Gleixner <tglx@...utronix.de>,
        AceLan Kao <acelan.kao@...onical.com>,
        linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org
Subject: Re: [PATCH 1/2] bdi: Do not use freezable workqueue

On 10/4/19 4:00 AM, Mika Westerberg wrote:
> A removable block device, such as an NVMe drive or an SSD connected
> over Thunderbolt, can be hot-removed at any time, including while the
> system is suspended. When the device is hot-removed during suspend and
> the system is then resumed, the kernel first resumes devices and then
> thaws userspace, including freezable workqueues. What happens in that
> case is that the NVMe driver notices that the device is unplugged and
> removes it from the system. This ends up calling bdi_unregister() for
> the gendisk, which then schedules wb_workfn() to be run one more time.
> 
> However, since bdi_wq is still frozen, the flush_delayed_work() call in
> wb_shutdown() blocks forever, halting the system resume process. The
> user sees this as a hang, since nothing happens anymore.
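
For context, the blocking call is issued from wb_shutdown() in
mm/backing-dev.c, which requeues the writeback work one last time and
then waits for it to finish; roughly (a simplified sketch of that path,
not the verbatim source):

    static void wb_shutdown(struct bdi_writeback *wb)
    {
            ...
            /* Drain the remaining work list by running wb_workfn() once more */
            mod_delayed_work(bdi_wq, &wb->dwork, 0);
            /* Waits here forever while bdi_wq is still frozen */
            flush_delayed_work(&wb->dwork);
    }
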
> 
> Triggering sysrq-w reveals this:
> 
>    Workqueue: nvme-wq nvme_remove_dead_ctrl_work [nvme]
>    Call Trace:
>     ? __schedule+0x2c5/0x630
>     ? wait_for_completion+0xa4/0x120
>     schedule+0x3e/0xc0
>     schedule_timeout+0x1c9/0x320
>     ? resched_curr+0x1f/0xd0
>     ? wait_for_completion+0xa4/0x120
>     wait_for_completion+0xc3/0x120
>     ? wake_up_q+0x60/0x60
>     __flush_work+0x131/0x1e0
>     ? flush_workqueue_prep_pwqs+0x130/0x130
>     bdi_unregister+0xb9/0x130
>     del_gendisk+0x2d2/0x2e0
>     nvme_ns_remove+0xed/0x110 [nvme_core]
>     nvme_remove_namespaces+0x96/0xd0 [nvme_core]
>     nvme_remove+0x5b/0x160 [nvme]
>     pci_device_remove+0x36/0x90
>     device_release_driver_internal+0xdf/0x1c0
>     nvme_remove_dead_ctrl_work+0x14/0x30 [nvme]
>     process_one_work+0x1c2/0x3f0
>     worker_thread+0x48/0x3e0
>     kthread+0x100/0x140
>     ? current_work+0x30/0x30
>     ? kthread_park+0x80/0x80
>     ret_from_fork+0x35/0x40
> 
> This is not limited to NVMe; exactly the same issue can be reproduced
> by hot-removing an SSD (over Thunderbolt) while the system is suspended.
> 
> Prevent this from happening by removing WQ_FREEZABLE from bdi_wq.
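
Conceptually the fix boils down to dropping WQ_FREEZABLE where bdi_wq is
allocated (default_bdi_init() in mm/backing-dev.c); roughly this kind of
hunk (a sketch of the intent, not necessarily the exact patch):

    -	bdi_wq = alloc_workqueue("bdi", WQ_MEM_RECLAIM | WQ_FREEZABLE |
    -				 WQ_UNBOUND | WQ_SYSFS, 0);
    +	bdi_wq = alloc_workqueue("bdi", WQ_MEM_RECLAIM | WQ_UNBOUND |
    +				 WQ_SYSFS, 0);
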

This series looks good to me; I don't think there's a reason for the
workers to be marked freezable.

-- 
Jens Axboe
