Message-ID: <20250502150027.GA818097@bhelgaas>
Date: Fri, 2 May 2025 10:00:27 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: hans.zhang@...tech.com
Cc: kbusch@...nel.org, axboe@...nel.dk, hch@....de, sagi@...mberg.me,
manivannan.sadhasivam@...aro.org, linux-nvme@...ts.infradead.org,
linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org
Subject: Re: [PATCH] nvme-pci: Fix system hang when ASPM L1 is enabled during
suspend

On Fri, May 02, 2025 at 11:20:51AM +0800, hans.zhang@...tech.com wrote:
> From: Hans Zhang <hans.zhang@...tech.com>
>
> When PCIe ASPM L1 is enabled (CONFIG_PCIEASPM_POWERSAVE=y), certain

CONFIG_PCIEASPM_POWERSAVE=y only sets the default ASPM policy. L1 can
be enabled dynamically at runtime regardless of the config.

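To illustrate the point about the default versus the runtime setting: on
a running Linux system the global ASPM policy is exposed in
/sys/module/pcie_aspm/parameters/policy, with the active entry shown in
square brackets. The following is only a minimal sketch for inspecting
that file; the helper names (active_aspm_policy, read_active_policy) are
made up for illustration and are not part of any kernel or library API.

```python
from pathlib import Path
from typing import Optional

# Global ASPM policy knob exposed by the pcie_aspm module parameters.
POLICY_PATH = Path("/sys/module/pcie_aspm/parameters/policy")


def active_aspm_policy(text: str) -> Optional[str]:
    """Return the bracketed (currently active) entry of the policy string.

    The file lists all policies on one line and marks the active one,
    e.g. "default performance [powersave] powersupersave".
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_active_policy(path: Path = POLICY_PATH) -> Optional[str]:
    """Read the policy file; return None if it is missing or unreadable."""
    try:
        return active_aspm_policy(path.read_text())
    except OSError:
        return None
```

Calling read_active_policy() on the affected machine would show whether
the policy actually in effect matches the Kconfig default, which is the
distinction the commit log glosses over.
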
> NVMe controllers fail to release LPI MSI-X interrupts during system
> suspend, leading to a system hang. This occurs because the driver's
> existing power management path does not fully disable the device
> when ASPM is active.

I have no idea what this has to do with ASPM L1. I do see that
nvme_suspend() tests pcie_aspm_enabled(pdev) (which seems kind of
janky and racy). But this doesn't explain anything about what would
cause a system hang.

> The fix adds an explicit device disable and reset preparation step
> in the suspend path after successfully setting the power state.
> This ensures proper cleanup of interrupt resources even when ASPM
> L1 is enabled, preventing the system from hanging during suspend.

Maybe there's a clue in the 600 lines of debug output that I trimmed,
but without some interpretation, I have no idea how to find it.

Unless you see similar problems on other systems, I would suspect an
issue with the SoC or the SoC driver where you do see problems.

Bjorn