lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aH43Sm3LWoipx5Yn@hovoldconsulting.com>
Date: Mon, 21 Jul 2025 14:49:14 +0200
From: Johan Hovold <johan@...nel.org>
To: Manivannan Sadhasivam <mani@...nel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@....qualcomm.com>,
	Lorenzo Pieralisi <lpieralisi@...nel.org>,
	Krzysztof WilczyƄski <kwilczynski@...nel.org>,
	Rob Herring <robh@...nel.org>, Bjorn Helgaas <bhelgaas@...gle.com>,
	linux-arm-msm@...r.kernel.org, linux-pci@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	Krishna Chaitanya Chundru <krishna.chundru@....qualcomm.com>,
	stable@...r.kernel.org
Subject: Re: [PATCH 1/2] PCI: qcom: Switch to bus notifier for enabling ASPM
 of PCI devices

On Mon, Jul 21, 2025 at 04:26:41PM +0530, Manivannan Sadhasivam wrote:
> On Mon, Jul 21, 2025 at 11:32:44AM GMT, Johan Hovold wrote:
> > On Mon, Jul 14, 2025 at 11:31:04PM +0530, Manivannan Sadhasivam wrote:
> > > Commit 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0
> > > ops") allowed the Qcom controller driver to enable ASPM for all PCI devices
> > > enumerated at the time of the controller driver probe. It proved to be
> > > useful for devices already powered on by the bootloader as it allowed
> > > devices to enter ASPM without user intervention.
> > > 
> > > However, it could not enable ASPM for the hotplug capable devices i.e.,
> > > devices enumerated *after* the controller driver probe. This limitation
> > > mostly went unnoticed as the Qcom PCI controllers are not hotplug capable
> > > and also the bootloader has been enabling the PCI devices before Linux
> > > Kernel boots (mostly on the Qcom compute platforms which users use on a
> > > daily basis).
> > > 
> > > But with the advent of the commit b458ff7e8176 ("PCI/pwrctl: Ensure that
> > > pwrctl drivers are probed before PCI client drivers"), the pwrctrl driver
> > > started to block the PCI device enumeration until it had been probed.
> > > Though, the intention of the commit was to avoid race between the pwrctrl
> > > driver and PCI client driver, it also meant that the pwrctrl controlled PCI
> > > devices may get probed after the controller driver and will no longer have
> > > ASPM enabled. So users started noticing high runtime power consumption with
> > > WLAN chipsets on Qcom compute platforms like Thinkpad X13s, and Thinkpad
> > > T14s, etc...
> > 
> > Note the ASPM regression for ath11k/ath12k only happened in 6.15, so
> > commit b458ff7e8176 ("PCI/pwrctl: Ensure that pwrctl drivers are probed
> > before PCI client drivers") in 6.13 does not seem to be the immediate
> > culprit here.
> 
> This series was intented to fix the ASPM issue which exist even before the
> introduction of pwrctrl framework.

But this limitation of the ASPM enable implementation wasn't really an
issue before pwrctrl since, as you point out above, these controllers
are not hotplug capable.

> But I also agree that the below commits made
> the issue more visible and caused regression on platforms where WLAN used to
> work.
> 
> > Candidates from 6.15 include commits like
> > 
> > 	957f40d039a9 ("PCI/pwrctrl: Move creation of pwrctrl devices to pci_scan_device()")
> > 	2489eeb777af ("PCI/pwrctrl: Skip scanning for the device further if pwrctrl device is created")
> > 
> > This is probably related to the reports of these drivers sometimes
> > failing to probe with
> > 
> > 	ath12k_pci 0004:01:00.0: of_irq_parse_pci: failed with rc=134
> > 
> > after pwrctrl was merged, and which since 6.15 should instead result in
> > the drivers not probing at all (as we've discussed off list).
> 
> We discussed about the ASPM issue IIRC. The above mentioned of_irq_parse_pci
> could also be related to the fact that we are turning off the supplies after
> pci_dev destruction. For this issue, I guess the patch from Brian could be the
> fix:
> 
> https://lore.kernel.org/linux-pci/20250711174332.1.I623f788178c1e4c5b1a41dbfc8c7fa55966373c0@changeid/

We've also discussed the rc=134 issue, which appears to be due to some
pwrctrl race. IIRC, you thought it may be the bluetooth driver powering
down the bt/wlan controller before the wlan bit has had a chance to
(complete its) probe. Not sure if that was fully confirmed, but I
remember you saying that the rc=134 symptom would no longer be visible
since 6.15 and instead wlan would never even probe at all if we hit this
issue...

The patch you link to above only appears to relate to drivers being
manually unbound. I hope we're not also hitting such issues during
regular boot?

> > > Obviously, it is the pwrctrl change that caused regression, but it
> > > ultimately uncovered a flaw in the ASPM enablement logic of the controller
> > > driver. So to address the actual issue, switch to the bus notifier for
> > > enabling ASPM of the PCI devices. The notifier will notify the controller
> > > driver when a PCI device is attached to the bus, thereby allowing it to
> > > enable ASPM more reliably. It should be noted that the
> > > 'pci_dev::link_state', which is required for enabling ASPM by the
> > > pci_enable_link_state_locked() API, is only set by the time of
> > > BUS_NOTIFY_BIND_DRIVER stage of the notification. So we cannot enable ASPM
> > > during BUS_NOTIFY_ADD_DEVICE stage.
> > > 
> > > So with this, we can also get rid of the controller driver specific
> > > 'qcom_pcie_ops::host_post_init' callback.
> > > 
> > > Cc: stable@...r.kernel.org # v6.7
> > > Fixes: 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops")
> > 
> > So whatever form this fix ends up taking it only needs to be backported
> > to 6.15.
> > 
> > As you mention above these platforms do not support hotplug, but even if
> > they were, enabling ASPM for hotplugged devices is arguably more of a
> > new features than a bug fix.
> 
> FYI, I'm going to drop this series in favor this (with one yet-to-be-submitted
> patch on top):
> https://lore.kernel.org/linux-pci/20250720190140.2639200-1-david.e.box@linux.intel.com/

Sounds good. Thanks.

Johan

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ