Message-ID: <cc0b6768-5722-2277-6e51-75baf3311dc5@nvidia.com>
Date:   Mon, 7 Feb 2022 17:41:30 +0530
From:   Vidya Sagar <vidyas@...dia.com>
To:     <nitirawa@...eaurora.org>
CC:     Keith Busch <kbusch@...nel.org>, <rafael.j.wysocki@...el.com>,
        <hch@....de>, <bhelgaas@...gle.com>, <mmaddireddy@...dia.com>,
        <kthota@...dia.com>, <sagar.tv@...il.com>,
        <linux-pci@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: Query related to shutting down NVMe during system suspend
On 2/7/2022 4:27 PM, nitirawa@...eaurora.org wrote:
> External email: Use caution opening links or attachments
> 
> 
> On 2022-02-01 22:28, Vidya Sagar wrote:
>> Thanks for the super quick reply and I couldn't agree more.
>>
>> On 2/1/2022 10:00 PM, Keith Busch wrote:
>>> On Tue, Feb 01, 2022 at 09:52:28PM +0530, Vidya Sagar wrote:
>>>> Hi Rafael & Christoph,
>>>> My query is regarding the comment and the code that follows after it at
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/nvme/host/pci.c?h=v5.17-rc2#n3243
>>>>
>>>> What I understood from it is that there is an underlying assumption
>>>> that the power to the devices is not removed during the suspend call.
>>>> In the case of device-tree based platforms like Tegra194, power is
>>>> indeed removed from the devices during the suspend-resume process.
>>>> Hence, the NVMe devices need to be taken through the shutdown path
>>>> irrespective of whether the ASPM states are enabled or not.
>>>> I would like to hear from you the best method to follow to achieve this.
>>>
>>> Since platform makers can't converge on how to let a driver know what
>>> it's supposed to do, I suggest we default to the simple shutdown suspend
>>> all the time. We can add a module parameter to let a user request nvme
>>> power management if they really want it. No matter what we do here,
>>> someone is going to complain, but at least simple shutdown is safe...
>>>
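The policy suggested above can be sketched as a toy model. This is plain Python, not kernel code, and the names `SuspendPath`, `choose_suspend_path` and `user_requested_pm` are hypothetical; no such module parameter is confirmed to exist in the nvme driver.

```python
from enum import Enum


class SuspendPath(Enum):
    """The two suspend strategies discussed in the thread."""
    SIMPLE_SHUTDOWN = "simple shutdown"
    NVME_POWER_MANAGEMENT = "nvme power management"


def choose_suspend_path(user_requested_pm=False):
    """Pick a suspend strategy under the suggested policy.

    user_requested_pm stands in for a hypothetical opt-in module
    parameter. Unless the user asks for NVMe power management, the
    device is always shut down, which stays safe even when the
    platform removes power during suspend.
    """
    if user_requested_pm:
        return SuspendPath.NVME_POWER_MANAGEMENT
    return SuspendPath.SIMPLE_SHUTDOWN


if __name__ == "__main__":
    print(choose_suspend_path().value)
    print(choose_suspend_path(user_requested_pm=True).value)
```

The point of the sketch is only the default: the safe path needs no platform knowledge, and the faster path becomes an explicit user choice.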
> 
> Hi Vidya,
> 
> Are you planning to add a module parameter based on the above discussion?
> I see similar behaviour even on a Qualcomm platform.
> 
> [  119.994092] nvme nvme0: I/O 9 QID 0 timeout, reset controller
> [  120.006612] PM: dpm_run_callback(): pci_pm_resume+0x0/0xe4 returns -16
> [  120.013502] nvme 0001:01:00.0: PM: pci_pm_resume+0x0/0xe4 returned -16 after 60059958 usecs
> [  120.022239] nvme 0001:01:00.0: PM: failed to resume async: error -16
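For reading the log above: on Linux, errno 16 is EBUSY, and 60059958 usecs is roughly 60 seconds spent in the resume callback. A minimal sketch of pulling those two values out of the quoted lines follows; the helper `summarize` and the one-entry `ERRNO_NAMES` table are made up for illustration and are not part of any kernel tooling.

```python
import re

# Resume-failure lines as quoted above, with the wrapped "-16" rejoined.
LOG = """\
[  119.994092] nvme nvme0: I/O 9 QID 0 timeout, reset controller
[  120.006612] PM: dpm_run_callback(): pci_pm_resume+0x0/0xe4 returns -16
[  120.013502] nvme 0001:01:00.0: PM: pci_pm_resume+0x0/0xe4 returned -16 after 60059958 usecs
[  120.022239] nvme 0001:01:00.0: PM: failed to resume async: error -16
"""

# Hand-written table with the single value needed here (Linux: 16 is EBUSY).
ERRNO_NAMES = {16: "EBUSY"}

# Matches "<callback> returned <code> after <n> usecs".
RETURNED = re.compile(r"(\S+) returned (-?\d+) after (\d+) usecs")


def summarize(log):
    """Return (callback, errno name, seconds) for the first matching line, or None."""
    match = RETURNED.search(log)
    if match is None:
        return None
    callback = match.group(1)
    code = int(match.group(2))
    usecs = int(match.group(3))
    name = ERRNO_NAMES.get(-code, str(code))
    return callback, name, usecs / 1_000_000


if __name__ == "__main__":
    print(summarize(LOG))
```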
Not really.
Keith Busch has already pushed a patch that fixes it in a different way,
and the issue is resolved (on Tegra platforms) with that patch:
https://lore.kernel.org/all/20220201165006.3074615-1-kbusch@kernel.org/
Thanks & Regards,
Vidya Sagar
> 
> Regards,
> Nitin
> 
> 