Message-ID: <2e0f12b3-59ad-4bf8-8416-5a7e0fff470c@nvidia.com>
Date: Fri, 23 Jan 2026 14:39:46 +0000
From: Jon Hunter <jonathanh@...dia.com>
To: Manivannan Sadhasivam <mani@...nel.org>
Cc: Bjorn Helgaas <helgaas@...nel.org>,
manivannan.sadhasivam@....qualcomm.com, Bjorn Helgaas <bhelgaas@...gle.com>,
Lorenzo Pieralisi <lpieralisi@...nel.org>,
Krzysztof Wilczyński <kwilczynski@...nel.org>,
Rob Herring <robh@...nel.org>, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-arm-msm@...r.kernel.org,
"David E. Box" <david.e.box@...ux.intel.com>,
Kai-Heng Feng <kai.heng.feng@...onical.com>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Heiner Kallweit <hkallweit1@...il.com>,
Chia-Lin Kao <acelan.kao@...onical.com>,
"linux-tegra@...r.kernel.org" <linux-tegra@...r.kernel.org>,
Keith Busch <kbusch@...nel.org>, Jens Axboe <axboe@...nel.dk>,
Christoph Hellwig <hch@....de>, Sagi Grimberg <sagi@...mberg.me>,
linux-nvme@...ts.infradead.org, krishna.chundru@....qualcomm.com
Subject: Re: [PATCH v2 1/2] PCI/ASPM: Override the ASPM and Clock PM states
set by BIOS for devicetree platforms
On 23/01/2026 13:56, Manivannan Sadhasivam wrote:
> + Krishna
>
> On Fri, Jan 23, 2026 at 10:55:28AM +0000, Jon Hunter wrote:
>>
>> On 22/01/2026 19:14, Jon Hunter wrote:
>>
>> ...
>>
>>>> I think what is going on here is that before commits f3ac2ff14834 and
>>>> df5192d9bb0e, the !pcie_aspm_enabled() check was passing because ASPM
>>>> was not enabled for the device (and upstream port). After those commits,
>>>> this check no longer passes, so the NVMe driver does not shut down the
>>>> controller and expects the link to stay in L0/L1ss. But the Tegra
>>>> controller driver initiates the L2/L3 transition and also turns off the
>>>> device, so all the NVMe context is lost during suspend, and while
>>>> resuming the NVMe driver gets confused by the lost context.
>>>>
>>>> Jon, could you please try the below hack and see if it fixes the issue?
>>>>
>>>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>>>> index 0e4caeab739c..4b8d261117f5 100644
>>>> --- a/drivers/nvme/host/pci.c
>>>> +++ b/drivers/nvme/host/pci.c
>>>> @@ -3723,7 +3723,7 @@ static int nvme_suspend(struct device *dev)
>>>> * state (which may not be possible if the link is up).
>>>> */
>>>> if (pm_suspend_via_firmware() || !ctrl->npss ||
>>>> - !pcie_aspm_enabled(pdev) ||
>>>> + pcie_aspm_enabled(pdev) ||
>>>> (ndev->ctrl.quirks & NVME_QUIRK_SIMPLE_SUSPEND))
>>>> return nvme_disable_prepare_reset(ndev, true);
>>>> This will confirm whether or not the issue is due to the Tegra
>>>> controller driver breaking the NVMe driver's assumption.
>>>
>>> Yes that appears to be working! I will test some more boards to confirm.
>>
>> So yes with the above all boards appear to be working fine.
>>
>> How is this usually coordinated between the NVMe driver and the host
>> controller driver? It is not clear to me exactly where the problem is,
>> and if the NVMe device is not being shut down, what should prevent the
>> host controller from shutting down?
>>
>
> Well, if the NVMe driver is not shutting down the device, then it expects the
> device to be in the APST state (the NVMe low-power state, if supported) and to
> retain all of its context across the suspend/resume cycle.
>
> But if the host controller powers down the device, then during resume the
> device will start afresh and will have lost all of its context (queue info,
> etc.). So when the NVMe driver resumes, it will expect the device to have
> retained its context and will try to use it as such. But that won't work, as
> the device will be in an unconfigured state, and you'll see failures like the
> ones you reported.
>
> Apparently, most host controller drivers never cared about this because they
> either were not tested with NVMe or did not enable ASPM before. So the NVMe
> driver ended up shutting down the controller during suspend. But since we
> started enabling ASPM by default in 6.18, this issue is being uncovered.
>
> So to properly fix it, we need the controller drivers to perform the below
> checks for all devices under the Root bus(ses) before initiating D3Cold:
>
> 1. Check if the device state is D3Hot. If it is not D3Hot, then the client
> driver expects the device to stay in its current D-state, so D3Cold should
> not be initiated.
>
> 2. Check if the device is wakeup capable. If it is, then check whether it
> can support wakeup from D3Cold (with WAKE#).
>
> Only if both conditions are satisfied for all the devices under the Root
> busses should the host controller driver initiate the D3Cold sequence.
>
> Krishna is going to post a patch that performs the above checks for the
> pcie-designware-host driver. But since these checks are platform agnostic,
> we should introduce a helper and reuse it across other controllers as well.
>
> Hope this clarifies.
Yes it does. I am happy to test any patches for this.
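
Just to check my understanding of the approach, would the controller-side
check end up looking roughly like the below before initiating D3Cold? This is
only a rough sketch on my side, the mapping of the two checks onto
pci_walk_bus(), pdev->current_state, device_may_wakeup() and
pci_pme_capable() is my guess rather than what Krishna's patch will actually
do, and pci_pme_capable(pdev, PCI_D3cold) is only an approximation of "can
wake from D3Cold with WAKE#":

#include <linux/pci.h>
#include <linux/pm_wakeup.h>

/*
 * Sketch only: walk all devices below the root bus and refuse to enter
 * D3Cold if any device is not in D3hot, or has wakeup enabled but cannot
 * wake up from D3Cold.
 */
static int pci_dev_blocks_d3cold(struct pci_dev *pdev, void *data)
{
	bool *d3cold_ok = data;

	if (pdev->current_state != PCI_D3hot ||
	    (device_may_wakeup(&pdev->dev) &&
	     !pci_pme_capable(pdev, PCI_D3cold))) {
		*d3cold_ok = false;
		return 1;	/* stop the bus walk */
	}

	return 0;
}

static bool host_bridge_can_enter_d3cold(struct pci_host_bridge *bridge)
{
	bool d3cold_ok = true;

	/* Check every device under the root bus before initiating D3Cold */
	pci_walk_bus(bridge->bus, pci_dev_blocks_d3cold, &d3cold_ok);

	return d3cold_ok;
}

If that is roughly the shape of it, then making this a common helper rather
than something designware-specific makes sense to me too.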
Jon
--
nvpublic