Message-ID: <ZvHUn1Q2R8FumZ20@skv.local>
Date: Mon, 23 Sep 2024 23:50:39 +0300
From: Andrey Skvortsov <andrej.skvortzov@...il.com>
To: Stuart Hayes <stuart.w.hayes@...il.com>
Cc: linux-kernel@...r.kernel.org,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	"Rafael J . Wysocki" <rafael@...nel.org>,
	Martin Belanger <Martin.Belanger@...l.com>,
	Oliver O'Halloran <oohall@...il.com>,
	Daniel Wagner <dwagner@...e.de>, Keith Busch <kbusch@...nel.org>,
	Lukas Wunner <lukas@...ner.de>, David Jeffery <djeffery@...hat.com>,
	Jeremy Allison <jallison@....com>, Jens Axboe <axboe@...com>,
	Christoph Hellwig <hch@....de>, Sagi Grimberg <sagi@...mberg.me>,
	linux-nvme@...ts.infradead.org
Subject: Re: [PATCH v8 3/4] driver core: shut down devices asynchronously

Hi Stuart,

On 24-08-22 15:28, Stuart Hayes wrote:
> Add code to allow asynchronous shutdown of devices, ensuring that each
> device is shut down before its parents & suppliers.
> 
> Only devices with drivers that have async_shutdown_enable enabled will be
> shut down asynchronously.
> 
> This can dramatically reduce system shutdown/reboot time on systems that
> have multiple devices that take many seconds to shut down (like certain
> NVMe drives). On one system tested, the shutdown time went from 11 minutes
> without this patch to 55 seconds with the patch.
> 
> Signed-off-by: Stuart Hayes <stuart.w.hayes@...il.com>
> Signed-off-by: David Jeffery <djeffery@...hat.com>
> ---
>  drivers/base/base.h           |  4 +++
>  drivers/base/core.c           | 54 ++++++++++++++++++++++++++++++++++-
>  include/linux/device/driver.h |  2 ++
>  3 files changed, 59 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/base/base.h b/drivers/base/base.h
> index 0b53593372d7..aa5a2bd3f2b8 100644
> --- a/drivers/base/base.h
> +++ b/drivers/base/base.h
> @@ -10,6 +10,7 @@
>   * shared outside of the drivers/base/ directory.

This change landed in linux-next, and I have a problem with shutdown on
an ARM Allwinner A64 device: the device usually hangs at shutdown.
git bisect pointed to "driver core: shut down devices asynchronously"
as the first bad commit.

I've tried to debug the problem, and this is what I see:

1) Device 'mmc_host mmc0' is processed in device_shutdown. For this device,
async_schedule_domain is called (cookie 264, for example).

2) After that, 'mmcblk mmc0:aaaa' is processed. For this device,
async_schedule_domain is called (cookie 296, for example).

3) 'mmc_host mmc0' is the parent of 'mmcblk mmc0:aaaa', so
parent->p->shutdown_after is updated from 263 to 296.

4) After some time, shutdown_one_device_async is called for 264
(mmc_host mmc0), but dev->p->shutdown_after has been updated to 296, so
the code first calls async_synchronize_cookie_domain for 297.

264 can't finish because it waits for 297, so the shutdown process
can't continue.
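
To make the wait condition in steps 1-4 concrete, here is a minimal
userspace sketch, not kernel code. It assumes that a synchronize call for
cookie N returns only once no pending entry has a cookie below N, and that
an entry stays pending while its own function is running. The names
cookie_t, lowest_pending, sync_can_return and waits_on_itself are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's cookie type. */
typedef unsigned long long cookie_t;

#define COOKIE_MAX ((cookie_t)-1)

/* Lowest cookie among entries that are scheduled but not yet finished. */
static cookie_t lowest_pending(const cookie_t *pending, size_t n)
{
	cookie_t lowest = COOKIE_MAX;

	for (size_t i = 0; i < n; i++)
		if (pending[i] < lowest)
			lowest = pending[i];
	return lowest;
}

/*
 * Assumed wait condition: a synchronize call for wait_cookie may return
 * once no pending entry has a cookie below wait_cookie.
 */
static bool sync_can_return(const cookie_t *pending, size_t n,
			    cookie_t wait_cookie)
{
	return lowest_pending(pending, n) >= wait_cookie;
}

/*
 * An entry is still pending while it runs. If it synchronizes on
 * shutdown_after + 1 and that value is above its own cookie, the
 * condition can never become true: the entry waits on itself.
 */
static bool waits_on_itself(cookie_t self, cookie_t shutdown_after)
{
	const cookie_t pending[] = { self };

	return !sync_can_return(pending, 1, shutdown_after + 1);
}

/* The values from the report: entry 264 with shutdown_after == 296. */
static void check_reported_case(void)
{
	assert(waits_on_itself(264, 296));
	/* With the original shutdown_after of 263 there is no self-wait. */
	assert(!waits_on_itself(264, 263));
}
```

In this model the hang needs no other device to be slow: once
shutdown_after for the parent is raised above the parent's own cookie, the
parent's entry alone is enough to block the wait.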

The problem always occurs with an MMC host controller.

-- 
Best regards,
Andrey Skvortsov
