Message-ID: <c2255367927729ee00c42ae4148c1301@codeaurora.org>
Date: Mon, 07 Feb 2022 21:14:49 +0530
From: nitirawa@...eaurora.org
To: Vidya Sagar <vidyas@...dia.com>, Keith Busch <kbusch@...nel.org>
Cc: Keith Busch <kbusch@...nel.org>, rafael.j.wysocki@...el.com,
hch@....de, bhelgaas@...gle.com, mmaddireddy@...dia.com,
kthota@...dia.com, sagar.tv@...il.com, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: Query related to shutting down NVMe during system suspend
On 2022-02-07 17:41, Vidya Sagar wrote:
> On 2/7/2022 4:27 PM, nitirawa@...eaurora.org wrote:
>> On 2022-02-01 22:28, Vidya Sagar wrote:
>>> Thanks for the super quick reply and I couldn't agree more.
>>>
>>> On 2/1/2022 10:00 PM, Keith Busch wrote:
>>>> On Tue, Feb 01, 2022 at 09:52:28PM +0530, Vidya Sagar wrote:
>>>>> Hi Rafael & Christoph,
>>>>> My query is regarding the comment and the code that follows after it at
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/nvme/host/pci.c?h=v5.17-rc2#n3243
>>>>> What I understood from it is that, there is an underlying assumption
>>>>> that the power to the devices is not removed during the suspend call.
>>>>> In the case of device-tree based platforms like Tegra194, power is
>>>>> indeed removed to the devices during suspend-resume process. Hence, the
>>>>> NVMe devices need to be taken through the shutdown path irrespective of
>>>>> whether the ASPM states are enabled or not.
>>>>> I would like to hear from you the best method to follow to achieve this.
>>>>
>>>> Since platform makers can't converge on how to let a driver know what
>>>> it's supposed to do, I suggest we default to the simple shutdown suspend
>>>> all the time. We can add a module parameter to let a user request nvme
>>>> power management if they really want it. No matter what we do here,
>>>> someone is going to complain, but at least simple shutdown is safe...
>>>>
>>
>> Hi Vidya,
>>
>> Are you planning to add a module parameter based on the above discussion?
>> I see similar behaviour even on a Qualcomm platform.
>>
>> [ 119.994092] nvme nvme0: I/O 9 QID 0 timeout, reset controller
>> [ 120.006612] PM: dpm_run_callback(): pci_pm_resume+0x0/0xe4 returns -16
>> [ 120.013502] nvme 0001:01:00.0: PM: pci_pm_resume+0x0/0xe4 returned -16 after 60059958 usecs
>> [ 120.022239] nvme 0001:01:00.0: PM: failed to resume async: error -16
> Not really.
> Keith Busch has already pushed a patch that fixes it in a different way,
> and the issue is resolved (on Tegra platforms) with that patch:
> https://lore.kernel.org/all/20220201165006.3074615-1-kbusch@kernel.org/
>
> Thanks & Regards,
> Vidya Sagar
>>
>> Regards,
>> Nitin
>>
>>
Thanks, Vidya, for pointing out the patch. It worked for us as well.
@Keith - could we please get this merged?
Regards,
Nitin