Message-ID:
 <SN6PR02MB41572B532608AD12685ED678D446A@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Mon, 30 Jun 2025 20:33:26 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Stuart Hayes <stuart.w.hayes@...il.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, Greg Kroah-Hartman
	<gregkh@...uxfoundation.org>, "Rafael J . Wysocki" <rafael@...nel.org>,
	Martin Belanger <Martin.Belanger@...l.com>, Oliver O'Halloran
	<oohall@...il.com>, Daniel Wagner <dwagner@...e.de>, Keith Busch
	<kbusch@...nel.org>, Lukas Wunner <lukas@...ner.de>, David Jeffery
	<djeffery@...hat.com>, Jeremy Allison <jallison@....com>, Jens Axboe
	<axboe@...com>, Christoph Hellwig <hch@....de>, Sagi Grimberg
	<sagi@...mberg.me>, "linux-nvme@...ts.infradead.org"
	<linux-nvme@...ts.infradead.org>, Nathan Chancellor <nathan@...nel.org>, Jan
 Kiszka <jan.kiszka@...mens.com>, Bert Karwatzki <spasswolf@....de>
Subject: RE: [PATCH v10 0/5] shut down devices asynchronously

From: Stuart Hayes <stuart.w.hayes@...il.com> Sent: Wednesday, June 25, 2025 1:19 PM
> 
> This adds the ability for the kernel to shut down devices asynchronously.
> 
> Only devices with drivers that enable it are shut down asynchronously.
> 
> This can dramatically reduce system shutdown/reboot time on systems that
> have multiple devices that take many seconds to shut down (like certain
> NVMe drives). On one system tested, the shutdown time went from 11 minutes
> without this patch to 55 seconds with the patch.
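As a rough illustration of where the saving comes from, here is a minimal
Python model, not the patch's code. It assumes a single shutdown loop that
blocks on synchronous devices and merely schedules opted-in devices, and it
ignores parent/child ordering. The function name and the device timings are
hypothetical and are not measurements from this thread.

```python
def shutdown_time(devices):
    """Estimate total shutdown time under a simplified model.

    devices: list of (seconds, is_async) tuples in shutdown order.
    Returns (serial_total, async_total).
    """
    # Fully synchronous shutdown: every device is waited for in turn.
    serial_total = sum(seconds for seconds, _ in devices)

    clock = 0.0          # time consumed by the main shutdown loop
    latest_finish = 0.0  # completion time of the slowest async shutdown
    for seconds, is_async in devices:
        if is_async:
            # Scheduled now; the loop moves on without waiting.
            latest_finish = max(latest_finish, clock + seconds)
        else:
            # Synchronous: the loop blocks until this device is down.
            clock += seconds

    # The loop finally waits for all outstanding async shutdowns.
    return serial_total, max(clock, latest_finish)


# Hypothetical system: four slow opted-in drives, ten fast sync devices.
devices = [(30.0, True)] * 4 + [(0.5, False)] * 10
serial, parallel = shutdown_time(devices)
print(serial, parallel)
```

In this model the asynchronous total is bounded by the slowest opted-in
device rather than by the sum of all of them, which is why systems with
several slow devices benefit most.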

I've tested this version and all looks good. I did the same tests that I did
with v9 [1], running in a VM in the Azure cloud. The 2 NVMe devices are
shut down in parallel, gaining about 110 milliseconds, and the slowdowns
seen in v9 did not recur. The net gain was ~100 ms.

I also tested a local Hyper-V VM that does not have any NVMe devices.
The shutdown timings with and without this patch set are pretty much
the same, which was not the case with v9.

I did not repeat the more detailed debugging from v9 as reported
here [2], since there is no unexpected slowness with v10.

For the series,

Tested-by: Michael Kelley <mhklinux@...look.com>

[1] https://lore.kernel.org/lkml/BN7PR02MB41480DE777B9C224F3C2DF43D4792@BN7PR02MB4148.namprd02.prod.outlook.com/
[2] https://lore.kernel.org/lkml/SN6PR02MB41571E2DD410D09CE7494B38D4402@SN6PR02MB4157.namprd02.prod.outlook.com/

> 
> Changes from V9:
> 
> Address resource and timing issues when spawning a unique async thread
> for every device during shutdown:
>   * Make the asynchronous threads able to shut down multiple devices,
>     instead of spawning a unique thread for every device.
>   * Modify core kernel async code with a custom wake function so it
>     doesn't wake up threads waiting to synchronize every time the cookie
>     changes
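
The first V9 change above (a few threads each shutting down several
devices, rather than one thread per device) can be sketched in user space
as follows. This is only an analogy in Python, assuming a simple shared
queue; it is not how kernel/async.c is structured, and the function name,
worker count and device names are hypothetical.

```python
import queue
import threading


def shutdown_with_workers(devices, shutdown_fn, num_workers=4):
    """Shut down all devices using a fixed number of worker threads."""
    pending = queue.Queue()
    for dev in devices:
        pending.put(dev)

    handled = {}  # device -> name of the worker that shut it down
    lock = threading.Lock()

    def worker():
        # Each worker keeps taking devices until the queue is empty,
        # so one thread ends up serving many devices.
        while True:
            try:
                dev = pending.get_nowait()
            except queue.Empty:
                return
            shutdown_fn(dev)
            with lock:
                handled[dev] = threading.current_thread().name

    threads = [
        threading.Thread(target=worker, name=f"shutdown-{i}")
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled


handled = shutdown_with_workers(
    [f"nvme{i}" for i in range(16)], lambda dev: None, num_workers=4
)
```

The number of threads stays at num_workers no matter how many devices
there are, which is the resource property the V9 change is after.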
> 
> Changes from V8:
> 
> Deal with shutdown hangs that result when a parent/supplier device is
>   later in the devices_kset list than its children/consumers:
>   * Ignore sync_state_only devlinks for shutdown dependencies
>   * Ignore shutdown_after for devices that don't want async shutdown
>   * Add a sanity check to revert to sync shutdown for any device that
>     would otherwise wait for a child/consumer shutdown that hasn't
>     already been scheduled
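
The ordering rule behind these changes (a device is shut down only after
its children/consumers are) can be sketched like this. It is a simplified
Python illustration, not the driver core code: it uses one thread per
device for brevity, which is exactly what v10 moved away from, and it
assumes every dependent appears in the device list. The function and
device names are hypothetical.

```python
import threading


def shutdown_all(devices, dependents, shutdown_fn):
    """Shut down devices concurrently while honouring dependencies.

    devices: device names to shut down.
    dependents: maps a device to the children/consumers it must wait for.
    Returns the order in which shutdowns completed.
    """
    done = {name: threading.Event() for name in devices}
    order = []
    lock = threading.Lock()

    def worker(name):
        # Wait until every child/consumer has finished shutting down.
        for dep in dependents.get(name, ()):
            done[dep].wait()
        shutdown_fn(name)
        with lock:
            order.append(name)
        # Only now may parents/suppliers of this device proceed.
        done[name].set()

    threads = [threading.Thread(target=worker, args=(n,)) for n in devices]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order


order = shutdown_all(
    ["pci0", "nvme0", "nvme0n1"],
    {"pci0": ["nvme0"], "nvme0": ["nvme0n1"]},
    lambda name: None,
)
print(order)
```

If a device waited on a dependent that was never scheduled, the wait would
never return; that is the kind of hang the V8 sanity check avoids by
falling back to synchronous shutdown.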
> 
> Changes from V7:
> 
> Do not expose driver async_shutdown_enable in sysfs.
> Wrapped a long line.
> 
> Changes from V6:
> 
> Removed a sysfs attribute that allowed the async device shutdown to be
> "on" (with driver opt-out), "safe" (driver opt-in), or "off"... what was
> previously "safe" is now the only behavior, so drivers now only need to
> have the option to enable or disable async shutdown.
> 
> Changes from V5:
> 
> Separated into multiple patches to make review easier.
> Reworked some code to make it more readable
> Made devices wait for consumers to shut down, not just children
>   (suggested by David Jeffery)
> 
> Changes from V4:
> 
> Change code to use cookies for synchronization rather than async domains
> Allow async shutdown to be disabled via sysfs, and allow driver opt-in or
>   opt-out of async shutdown (when not disabled), with ability to control
>   driver opt-in/opt-out via sysfs
> 
> Changes from V3:
> 
> Bug fix (used "parent" not "dev->parent" in device_shutdown)
> 
> Changes from V2:
> 
> Removed recursive functions to schedule children to be shut down before
>   parents, since existing device_shutdown loop will already do this
> 
> Changes from V1:
> 
> Rewritten using kernel async code (suggested by Lukas Wunner)
> 
> David Jeffery (1):
>   kernel/async: streamline cookie synchronization
> 
> Stuart Hayes (4):
>   driver core: don't always lock parent in shutdown
>   driver core: separate function to shutdown one device
>   driver core: shut down devices asynchronously
>   nvme-pci: Make driver prefer asynchronous shutdown
> 
>  drivers/base/base.h           |   8 ++
>  drivers/base/core.c           | 210 +++++++++++++++++++++++++++++-----
>  drivers/nvme/host/pci.c       |   1 +
>  include/linux/device/driver.h |   2 +
>  kernel/async.c                |  42 ++++++-
>  5 files changed, 236 insertions(+), 27 deletions(-)
> 
> --
> 2.39.3
> 

