[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <3rsijaalogupaxilfr7rjrqrzqzxrk2sqnj2zeqhaideqwpani@w2ozjbxihkgq>
Date: Mon, 1 Sep 2025 09:08:36 +0530
From: Manivannan Sadhasivam <mani@...nel.org>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: manivannan.sadhasivam@....qualcomm.com,
Lorenzo Pieralisi <lpieralisi@...nel.org>, Krzysztof Wilczyński <kwilczynski@...nel.org>,
Rob Herring <robh@...nel.org>, Bjorn Helgaas <bhelgaas@...gle.com>,
linux-arm-msm@...r.kernel.org, linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
"David E. Box" <david.e.box@...ux.intel.com>, Krishna Chaitanya Chundru <krishna.chundru@....qualcomm.com>,
Johan Hovold <johan@...nel.org>, stable+noautosel@...nel.org
Subject: Re: [PATCH v3] PCI: qcom: Use pci_host_set_default_pcie_link_state()
API to enable ASPM for all platforms
On Wed, Aug 27, 2025 at 01:22:58PM GMT, Bjorn Helgaas wrote:
> On Wed, Aug 27, 2025 at 10:43:26PM +0530, Manivannan Sadhasivam wrote:
> > On Wed, Aug 27, 2025 at 10:47:39AM GMT, Bjorn Helgaas wrote:
> > > On Mon, Aug 25, 2025 at 01:52:57PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> > > > From: Manivannan Sadhasivam <manivannan.sadhasivam@....qualcomm.com>
> > > >
> > > > Commit 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms
> > > > supporting 1.9.0 ops") allowed the Qcom controller driver to
> > > > enable ASPM for all PCI devices enumerated at the time of the
> > > > controller driver probe. It proved to be useful for devices
> > > > already powered on by the bootloader, as it allowed devices to
> > > > enter ASPM without user intervention.
> > > >
> > > > However, it could not enable ASPM for the hotplug capable
> > > > devices i.e., devices enumerated *after* the controller driver
> > > > probe. This limitation mostly went unnoticed as the Qcom PCI
> > > > controllers are not hotplug capable and also the bootloader has
> > > > been enabling the PCI devices before Linux Kernel boots (mostly
> > > > on the Qcom compute platforms which users use on a daily basis).
> > > >
> > > > But with the advent of the commit b458ff7e8176 ("PCI/pwrctl:
> > > > Ensure that pwrctl drivers are probed before PCI client
> > > > drivers"), the pwrctrl driver started to block the PCI device
> > > > enumeration until it had been probed. Though, the intention of
> > > > the commit was to avoid race between the pwrctrl driver and PCI
> > > > client driver, it also meant that the pwrctrl controlled PCI
> > > > devices may get probed after the controller driver and will no
> > > > longer have ASPM enabled. So users started noticing high runtime
> > > > power consumption with WLAN chipsets on Qcom compute platforms
> > > > like Thinkpad X13s, and Thinkpad T14s, etc...
> > > >
> > > > Obviously, it is the pwrctrl change that caused regression, but
> > > > it ultimately uncovered a flaw in the ASPM enablement logic of
> > > > the controller driver. So to address the actual issue, use the
> > > > newly introduced pci_host_set_default_pcie_link_state() API,
> > > > which allows the controller drivers to set the default ASPM
> > > > state for *all* devices. This default state will be honored by
> > > > the ASPM driver while enabling ASPM for all the devices.
> > >
> > > So I guess this fixes a power consumption regression that became
> > > visible with b458ff7e8176 ("PCI/pwrctl: Ensure that pwrctl drivers
> > > are probed before PCI client drivers"). Arguably we should have a
> > > Fixes: tag for that, and maybe even skip the tag for 9f4f3dfad8cf,
> > > since if you have 9f4f3dfad8cf but not b458ff7e8176, it doesn't
> > > sound like you would need to backport this change?
> >
> > 9f4f3dfad8cf is the culprit which added a half baked solution for
> > enabling ASPM and b458ff7e8176 made the issue more obvious since
> > instead of requiring people to connect a device after boot, it
> > allowed the device to enumerate after the probe of the controller
> > driver, thereby making it more visible.
> >
> > It would make sense to add a Fixes tag for b458ff7e8176 too, but
> > 9f4f3dfad8cf should also be present IMO.
>
> OK. IIUC we would only land on 9f4f3dfad8cf for hot-plugged devices,
> and you said these controllers don't support hotplug. Backporting
> something to fix a half-baked solution that doesn't cause an actual
> problem is of marginal value, but we can keep the Fixes: 9f4f3dfad8cf.
>
Just to add more: It is true that our controllers doesn't support hotplug, but
with the addition of pwrctrl, the enumeration of the devices could be deferred
for a while (until the pwrctrl driver gets probed and does its job even if the
device was powered on by the bootloader), then we will trip over this issue. So
we will need to backport it to 6.13 where commit b458ff7e8176 got introduced.
But I added the noautosel tag since the backport requires manual work.
> > > > Also with this change, ASPM is now enabled by default on all
> > > > Qcom platforms as I haven't heard of any reported issues (apart
> > > > from the unsupported L0s on some platorms, which is still
> > > > handled separately).
> > >
> > > If ASPM hasn't been enabled by default, how would you hear about
> > > issues? Is ASPM commonly enabled in some other way?
> >
> > Mostly from the downstream drivers. They do enable ASPM by default
> > and I haven't heard of any issues with that. So I assumed that would
> > mean it will be safe for us to enable ASPM for all platforms in
> > upstream as well.
> >
> > > If you think the risk of ASPM issues is nil, I guess it's OK to do
> > > at the same time. From a maintenance perspective it might be nice
> > > to separate that change so if there happened to be a regression,
> > > we could identify and revert that part by itself if necessary. It
> > > looks like previously, ASPM was only enabled for one part:
> > >
> > > ops_2_7_0 # cfg_2_7_0 qcom,pcie-sdm845
> >
> > No. Previously ASPM was enabled for ops_1_9_0 and ops_1_21_0 based
> > platforms.
>
> Oops, my mistake. Traversing all the ops_ and cfg_ structs was a
> little confusing. For posterity, I guess the corrected matrix is that
> ASPM was previously enabled for these:
>
> ops_1_9_0 # cfg_1_9_0 qcom,pcie-sc7280 qcom,pcie-sc8180x qcom,pcie-sdx55 qcom,pcie-sm8150 qcom,pcie-sm8250 qcom,pcie-sm8350 qcom,pcie-sm8450-pcie0 qcom,pcie-sm8450-pcie1 qcom,pcie-sm8550
> # cfg_1_34_0 qcom,pcie-sa8775p
> ops_1_21_0 # cfg_sc8280xp qcom,pcie-sa8540p qcom,pcie-sc8280xp qcom,pcie-x1e80100
>
> And will now be enabled for these:
>
> ops_2_1_0 # cfg_2_1_0 qcom,pcie-apq8064 qcom,pcie-ipq8064 qcom,pcie-ipq8064-v2
> ops_1_0_0 # cfg_1_0_0 qcom,pcie-apq8084
> ops_2_3_2 # cfg_2_3_2 qcom,pcie-msm8996
> ops_2_4_0 # cfg_2_4_0 qcom,pcie-ipq4019 qcom,pcie-qcs404
> ops_2_3_3 # cfg_2_3_3 qcom,pcie-ipq8074
> ops_2_7_0 # cfg_2_7_0 qcom,pcie-sdm845
> ops_2_9_0 # cfg_2_9_0 qcom,pcie-ipq5018 qcom,pcie-ipq6018 qcom,pcie-ipq8074-gen3 qcom,pcie-ipq9574
>
> > If you insist, I can do that by calling
> > pci_host_set_default_pcie_link_state() from qcom_pcie_init_2_7_0()
> > and later move it to qcom_pcie_host_init() in a separate patch.
>
> I don't insist; that's why I said "if you think the risk of issues
> is nil," since I didn't know that you had any experience with ASPM
> being enabled on the others. But it sounds like you do, so I'm ok
> with this as-is.
>
Ok. In any case, we need to sort out the discussion happening with the series
from David [1] first. Currently, ASPM is broken on couple of Snapdragon laptops
since v6.13 and I'd love to fix it asap.
- Mani
[1] https://lore.kernel.org/linux-pci/20250828204345.GA958461@bhelgaas
--
மணிவண்ணன் சதாசிவம்
Powered by blists - more mailing lists