Message-ID: <cc0b6768-5722-2277-6e51-75baf3311dc5@nvidia.com>
Date: Mon, 7 Feb 2022 17:41:30 +0530
From: Vidya Sagar <vidyas@...dia.com>
To: <nitirawa@...eaurora.org>
CC: Keith Busch <kbusch@...nel.org>, <rafael.j.wysocki@...el.com>,
<hch@....de>, <bhelgaas@...gle.com>, <mmaddireddy@...dia.com>,
<kthota@...dia.com>, <sagar.tv@...il.com>,
<linux-pci@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: Query related to shutting down NVMe during system suspend

On 2/7/2022 4:27 PM, nitirawa@...eaurora.org wrote:
> On 2022-02-01 22:28, Vidya Sagar wrote:
>> Thanks for the super quick reply and I couldn't agree more.
>>
>> On 2/1/2022 10:00 PM, Keith Busch wrote:
>>> On Tue, Feb 01, 2022 at 09:52:28PM +0530, Vidya Sagar wrote:
>>>> Hi Rafael & Christoph,
>>>> My query is regarding the comment and the code that follows it at
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/nvme/host/pci.c?h=v5.17-rc2#n3243
>>>>
>>>> What I understood from it is that there is an underlying assumption
>>>> that power to the devices is not removed during the suspend call.
>>>> On device-tree based platforms like Tegra194, power is indeed
>>>> removed from the devices during the suspend-resume process. Hence,
>>>> the NVMe devices need to be taken through the shutdown path
>>>> irrespective of whether the ASPM states are enabled or not.
>>>> I would like to hear from you the best method to achieve this.
>>>
>>> Since platform makers can't converge on how to let a driver know what
>>> it's supposed to do, I suggest we default to the simple shutdown
>>> suspend all the time. We can add a module parameter to let a user
>>> request nvme power management if they really want it. No matter what
>>> we do here, someone is going to complain, but at least simple
>>> shutdown is safe...
>>>
>
> Hi Vidya,
>
> Are you planning to add a module parameter based on the above
> discussion? I see similar behaviour even on a Qualcomm platform.
>
> [ 119.994092] nvme nvme0: I/O 9 QID 0 timeout, reset controller
> [ 120.006612] PM: dpm_run_callback(): pci_pm_resume+0x0/0xe4 returns -16
> [ 120.013502] nvme 0001:01:00.0: PM: pci_pm_resume+0x0/0xe4 returned -16 after 60059958 usecs
> [ 120.022239] nvme 0001:01:00.0: PM: failed to resume async: error -16

Not really.
Keith Busch has already pushed a patch that fixes it in a different way,
and the issue is resolved (on Tegra platforms) with that patch:
https://lore.kernel.org/all/20220201165006.3074615-1-kbusch@kernel.org/

Thanks & Regards,
Vidya Sagar
>
> Regards,
> Nitin
>
>
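
For readers following the thread, the policy Keith Busch suggested above
(simple shutdown by default, NVMe power management only when the user asks
for it) can be sketched in plain user-space C. This is only an illustration
of the decision, not code from drivers/nvme/host/pci.c and not the patch
linked above. Every name here (suspend_action, use_nvme_pm,
choose_suspend_action, action_name) is hypothetical, and the boolean merely
stands in for a module parameter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout; none of this is taken from the driver. */
enum suspend_action {
    SUSPEND_SIMPLE_SHUTDOWN,  /* full controller shutdown; safe if power is removed */
    SUSPEND_NVME_POWER_MGMT,  /* leave controller up and rely on NVMe power states */
};

/* Stand-in for a module parameter; off by default, as suggested in the thread. */
static bool use_nvme_pm = false;

/* Pick the suspend path: shut down unless NVMe power management was requested. */
static enum suspend_action choose_suspend_action(bool nvme_pm_requested)
{
    return nvme_pm_requested ? SUSPEND_NVME_POWER_MGMT
                             : SUSPEND_SIMPLE_SHUTDOWN;
}

/* Human-readable label for logging or debugging. */
static const char *action_name(enum suspend_action action)
{
    switch (action) {
    case SUSPEND_SIMPLE_SHUTDOWN:
        return "simple shutdown";
    case SUSPEND_NVME_POWER_MGMT:
        return "nvme power management";
    }
    assert(0 && "unknown suspend action");
    return "unknown";
}
```

Calling choose_suspend_action(use_nvme_pm) with the default value selects
the shutdown path, which matches the behaviour a platform such as Tegra194
needs when power is removed during suspend. Passing true models a user who
explicitly opts in to NVMe power management.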