[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJZ5v0g3cCYK3rAQn09pCr7LMrRr=zQy_ceaEB5AKhVx604YgA@mail.gmail.com>
Date: Tue, 14 May 2019 10:04:22 +0200
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Keith Busch <kbusch@...nel.org>
Cc: Mario Limonciello <Mario.Limonciello@...l.com>,
Christoph Hellwig <hch@....de>,
Keith Busch <keith.busch@...el.com>,
Sagi Grimberg <sagi@...mberg.me>,
linux-nvme <linux-nvme@...ts.infradead.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux PM <linux-pm@...r.kernel.org>,
Kai-Heng Feng <kai.heng.feng@...onical.com>
Subject: Re: [PATCH] nvme/pci: Use host managed power state for suspend
On Mon, May 13, 2019 at 5:10 PM Keith Busch <kbusch@...nel.org> wrote:
>
> On Mon, May 13, 2019 at 03:05:42PM +0000, Mario.Limonciello@...l.com wrote:
> > This system power state - suspend to idle is going to freeze threads.
> > But we're talking a multi threaded kernel. Can't there be a timing problem going
> > on then too? With a disk flush being active in one task and the other task trying
> > to put the disk into the deepest power state. If you don't freeze the queues how
> > can you guarantee that didn't happen?
>
> But if an active data flush task is running, then we're not idle and
> shouldn't go to low power.
To be entirely precise, system suspend prevents user space from
running while it is in progress. It doesn't do that to kernel
threads, at least not by default, though, so if there is a kernel
thread flushing the data, it needs to be stopped or suspended somehow
directly in the system suspend path. [And yes, system suspend (or
hibernation) may take place at any time so long as all user space can
be prevented from running then (by means of the tasks freezer).]
However, freezing the queues from a driver ->suspend callback doesn't
help in general and the reason why is hibernation. Roughly speaking,
hibernation works in two steps, the first of which creates a snapshot
image of system memory and the second one writes that image to
persistent storage. Devices are resumed between the two steps in
order to make it possible to do the write, but that would unfreeze the
queues and let the data flusher run. If it runs, it may cause the
memory snapshot image that has just been created to become outdated
and restoring the system memory contents from that image going forward
may cause corruption to occur.
Thus freezing the queues from a driver ->suspend callback should not
be relied on for correctness if the same callback is used for system
suspend and hibernation, which is the case here. If doing that
prevents the system from crashing, it is critical to find out why IMO,
as that may very well indicate a broader issue, not necessarily in the
driver itself.
But note that even if the device turns out to behave oddly, it still
needs to be handled, unless it may be prevented from shipping to users
in that shape. If it ships, users will face the odd behavior anyway.
Powered by blists - more mailing lists