Message-ID: <20250714193214.GA2415073@bhelgaas>
Date: Mon, 14 Jul 2025 14:32:14 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: Hans Zhang <18255117159@....com>
Cc: Manivannan Sadhasivam <mani@...nel.org>,
Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
Krishna Chaitanya Chundru <krishna.chundru@....qualcomm.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Jingoo Han <jingoohan1@...il.com>,
Lorenzo Pieralisi <lpieralisi@...nel.org>,
Rob Herring <robh@...nel.org>, Jeff Johnson <jjohnson@...nel.org>,
Bartosz Golaszewski <brgl@...ev.pl>,
Krzysztof Wilczyński <kwilczynski@...nel.org>,
linux-pci@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
linux-arm-msm@...r.kernel.org, mhi@...ts.linux.dev,
linux-wireless@...r.kernel.org, ath11k@...ts.infradead.org,
qiang.yu@....qualcomm.com, quic_vbadigan@...cinc.com,
quic_vpernami@...cinc.com, quic_mrana@...cinc.com,
Jeff Johnson <jeff.johnson@....qualcomm.com>
Subject: Re: [PATCH v4 06/11] PCI/ASPM: Clear aspm_disable as part of
__pci_enable_link_state()

On Sun, Jul 13, 2025 at 12:05:18AM +0800, Hans Zhang wrote:
> On 2025/7/12 17:35, Manivannan Sadhasivam wrote:
> ...
> > > IMO the "someday" goal should be that we get rid of aspm_policy
> > > and enable all the available power saving states by default. We
> > > have sysfs knobs that administrators can use if necessary, and
> > > drivers or quirks can disable states if they need to work around
> > > hardware defects.
> >
> > Yeah, I think the default should be powersave and let the users
> > disable it for performance if they want.
>
> Perhaps I don't think so. At present, our company's testing team has
> tested quite a few NVMe SSDs. As far as I can remember, the SSDs
> from two companies have encountered problems and will hang directly
> when turned on. We have set CONFIG_PCIEASPM_POWERSAVE=y by default.
> When encountering SSDs from these two companies, we had to add
> "pcie_aspm.policy=default" in the cmdline, and then the boot worked
> normally. Currently, we do not have a PCIe protocol analyzer to
> analyze such issues. The current approach is to modify the cmdline.
> So I can't prove whether it's a problem with the Root Port of our
> SOC or the SSD device.

Have you reported these?

> Here I agree with Bjorn's statement that sometimes the EP is not
> necessarily very standard and there are no hardware issues.
> Personally, I think the default is default or performance. When
> users need to save power, they should then decide whether to
> configure it as powersave or powersupersave. Sometimes, if the EP
> device connected by the customer is perfect, they can turn it on to
> save power. But if the EP is not perfect, at least they will
> immediately know what caused the problem.

We should discover device defects as early as possible so we can add
quirks for them. Defaulting to ASPM being partly disabled means it
gets much less testing and users end up passing around "fixes" like
booting with "pcie_aspm.policy=default" or similar. I do not want
users to trip over a device that doesn't work and have to look for
workarounds on the web.
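
For anyone comparing notes on which policy a machine actually booted
with, here is a minimal userspace sketch. It is not part of this
series. It assumes the policy is exposed at
/sys/module/pcie_aspm/parameters/policy and that the active entry is
the one in square brackets (e.g. "[default] performance powersave
powersupersave"). The helper names are made up for illustration.

```python
from pathlib import Path

# Assumed sysfs location of the pcie_aspm "policy" module parameter.
POLICY_PATH = Path("/sys/module/pcie_aspm/parameters/policy")


def parse_active_policy(text):
    """Return the bracketed (active) policy name, or None if none is marked."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_active_policy(path=POLICY_PATH):
    """Read the policy file; return None if it cannot be read
    (for example, when the kernel was built without ASPM support)."""
    try:
        return parse_active_policy(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print(read_active_policy())
```

Including that value in a bug report makes it easier to tell whether
a hang depends on the policy in effect.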

I also think it's somewhat irresponsible of us to consume more power
than necessary. But as Mani said, this would be a big change and
might have to be done with a BIOS date check or something to try to
avoid regressions.

Bjorn