lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 07 Feb 2022 21:14:49 +0530
From:   nitirawa@...eaurora.org
To:     Vidya Sagar <vidyas@...dia.com>, Keith Busch <kbusch@...nel.org>
Cc:     Keith Busch <kbusch@...nel.org>, rafael.j.wysocki@...el.com,
        hch@....de, bhelgaas@...gle.com, mmaddireddy@...dia.com,
        kthota@...dia.com, sagar.tv@...il.com, linux-pci@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: Query related to shutting down NVMe during system suspend

On 2022-02-07 17:41, Vidya Sagar wrote:
> On 2/7/2022 4:27 PM, nitirawa@...eaurora.org wrote:
>> External email: Use caution opening links or attachments
>> 
>> 
>> On 2022-02-01 22:28, Vidya Sagar wrote:
>>> Thanks for the super quick reply and I couldn't agree more.
>>> 
>>> On 2/1/2022 10:00 PM, Keith Busch wrote:
>>>> External email: Use caution opening links or attachments
>>>> 
>>>> 
>>>> On Tue, Feb 01, 2022 at 09:52:28PM +0530, Vidya Sagar wrote:
>>>>> Hi Rafael & Christoph,
>>>>> My query is regarding the comment and the code that follows after 
>>>>> it
>>>>> at
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/nvme/host/pci.c?h=v5.17-rc2#n3243 
>>>>> What I understood from it is that, there is an underlying 
>>>>> assumption
>>>>> that the power to the devices is not removed during the suspend 
>>>>> call.
>>>>> In the case of device-tree based platforms like Tegra194, power is
>>>>> indeed removed to the devices during suspend-resume process. Hence,
>>>>> the
>>>>> NVMe devices need to be taken through the shutdown path 
>>>>> irrespective
>>>>> of
>>>>> whether the ASPM states are enabled or not.
>>>>> I would like to hear from you the best method to follow to achieve
>>>>> this.
>>>> 
>>>> Since platform makers can't converge on how to let a driver know 
>>>> what
>>>> it's supposed to do, I suggest we default to the simple shutdown
>>>> suspend
>>>> all the time. We can add a module parameter to let a user request 
>>>> nvme
>>>> power management if they really want it. No matter what we do here,
>>>> someone is going to complain, but at least simple shutdown is 
>>>> safe...
>>>> 
>> 
>> Hi Vidya,
>> 
>> Are you planning to add module parameter based on above discussion. I
>> see similar behaviour even with  qualcomm platform.
>> 
>> [  119.994092] nvme nvme0: I/O 9 QID 0 timeout, reset controller
>> [  120.006612] PM: dpm_run_callback(): pci_pm_resume+0x0/0xe4 returns
>> -16
>> [  120.013502] nvme 0001:01:00.0: PM: pci_pm_resume+0x0/0xe4 returned
>> -16 after 60059958 usecs
>> [  120.022239] nvme 0001:01:00.0: PM: failed to resume async: error 
>> -16
> Not really.
> Keith Busch has already pushed a patch to fix it in a different way
> and issue is resolved (on Tegra platforms) with that patch.
> https://lore.kernel.org/all/20220201165006.3074615-1-kbusch@kernel.org/
> is that patch.
> 
> Thanks & Regards,
> Vidya Sagar
>> 
>> Regards,
>> Nitin
>> 
>> 

Thanks Vidya for pointing out the patch . This patch worked for us as 
well.
@keith - Please can we get this merged .

Regards,
Nitin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ