[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251027230703.GA1485546@bhelgaas>
Date: Mon, 27 Oct 2025 18:07:03 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: Manivannan Sadhasivam <mani@...nel.org>
Cc: linux-pci@...r.kernel.org,
Manivannan Sadhasivam <manivannan.sadhasivam@....qualcomm.com>,
Krzysztof WilczyĆski <kwilczynski@...nel.org>,
Johan Hovold <johan@...nel.org>, Frank Li <Frank.li@....com>,
Shawn Lin <shawn.lin@...k-chips.com>, Rob Herring <robh@...nel.org>,
"David E . Box" <david.e.box@...ux.intel.com>,
Kai-Heng Feng <kai.heng.feng@...onical.com>,
"Rafael J . Wysocki" <rafael@...nel.org>,
Heiner Kallweit <hkallweit1@...il.com>,
Chia-Lin Kao <acelan.kao@...onical.com>,
Gustavo Pimentel <Gustavo.Pimentel@...opsys.com>,
Han Jingoo <jingoohan1@...il.com>,
Bjorn Andersson <andersson@...nel.org>,
Konrad Dybcio <konrad.dybcio@....qualcomm.com>,
Bartosz Golaszewski <brgl@...ev.pl>, linux-arm-msm@...r.kernel.org,
linux-kernel@...r.kernel.org, Bjorn Helgaas <bhelgaas@...gle.com>
Subject: Re: [PATCH] Revert "PCI: qcom: Remove custom ASPM enablement code"
On Mon, Oct 27, 2025 at 05:21:30PM +0530, Manivannan Sadhasivam wrote:
> On Sun, Oct 26, 2025 at 02:37:54PM -0500, Bjorn Helgaas wrote:
> > On Sun, Oct 26, 2025 at 08:58:29PM +0530, Manivannan Sadhasivam wrote:
> > > On Fri, Oct 24, 2025 at 04:04:57PM -0500, Bjorn Helgaas wrote:
> > > > From: Bjorn Helgaas <bhelgaas@...gle.com>
> > > >
> > > > This reverts commit a729c16646198872e345bf6c48dbe540ad8a9753.
> > > >
> > > > Prior to a729c1664619 ("PCI: qcom: Remove custom ASPM enablement code"),
> > > > the qcom controller driver enabled ASPM, including L0s, L1, and L1 PM
> > > > Substates, for all devices powered on at the time the controller driver
> > > > enumerates them.
> > > >
> > > > ASPM was *not* enabled for devices powered on later by pwrctrl (unless the
> > > > kernel was built with PCIEASPM_POWERSAVE or PCIEASPM_POWER_SUPERSAVE, or
> > > > the user enabled ASPM via module parameter or sysfs).
> > > >
> > > > After f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for
> > > > devicetree platforms"), the PCI core enabled all ASPM states for all
> > > > devices whether powered on initially or by pwrctrl, so a729c1664619 was
> > > > unnecessary and reverted.
> > > >
> > > > But f3ac2ff14834 was too aggressive and broke platforms that didn't support
> > > > CLKREQ# or required device-specific configuration for L1 Substates, so
> > > > df5192d9bb0e ("PCI/ASPM: Enable only L0s and L1 for devicetree platforms")
> > > > enabled only L0s and L1.
> > > >
> > > > On Qualcomm platforms, this left L1 Substates disabled, which was a
> > > > regression. Revert a729c1664619 so L1 Substates will be enabled on devices
> > > > that are initially powered on. Devices powered on by pwrctrl will be
> > > > addressed later.
> ...
> > I have some heartburn about both the revert and the
> > pci_host_set_default_pcie_link_state() approach because they apply to
> > the entire hierarchy under a qcom or VMD root port, potentially
> > including add-in cards with switches. CLKREQ# (and possibly more) is
> > required to enable L1SS, and I don't know if we can assume it's
> > supported on add-in links.
>
> I don't think we can assume, but at the same time, I don't think we
> will ever be able to come up with a logical way to enable L1ss on
> all devices. But if we leave the decision to the host controller
> drivers, they can at least guarantee that the CLKREQ# and other
> requirements are satisfied from the host perspective for L1ss. Then,
> if any device exhibit erratic behavior, we will for sure know that
> the device is at fault and we can quirk them.
If we can figure out that an endpoint is defective, a quirk is great.
But the issue might be something in the path, e.g., some connector in
the path leading to the endpoint doesn't include CLKREQ#, and we can't
quirk the endpoint then.
To me it sounds like the mainline kernel should be safe and only
enable L1SS when it has a clear signal that it is safe, either via
devicetree, ACPI, or L1SS configuration inherited from firmware. I
don't want a future of telling users to boot with "pcie_aspm=off" if
a device doesn't work.
Enabling L1SS *without* such a clear signal feels like something
downstream kernels might have to do when they know more about the
topology.
Bjorn
Powered by blists - more mailing lists