lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240228220215.GA308296@bhelgaas>
Date: Wed, 28 Feb 2024 16:02:15 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: Johan Hovold <johan@...nel.org>
Cc: Johan Hovold <johan+linaro@...nel.org>,
	Bjorn Helgaas <bhelgaas@...gle.com>,
	Bjorn Andersson <andersson@...nel.org>,
	Konrad Dybcio <konrad.dybcio@...aro.org>,
	Lorenzo Pieralisi <lpieralisi@...nel.org>,
	Krzysztof WilczyƄski <kw@...ux.com>,
	Rob Herring <robh@...nel.org>,
	Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
	Conor Dooley <conor+dt@...nel.org>,
	Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>,
	linux-arm-msm@...r.kernel.org, linux-pci@...r.kernel.org,
	devicetree@...r.kernel.org, linux-kernel@...r.kernel.org,
	stable@...r.kernel.org
Subject: Re: [PATCH v2 04/12] PCI: qcom: Add support for disabling ASPM L0s
 in devicetree

On Tue, Feb 27, 2024 at 04:29:15PM +0100, Johan Hovold wrote:
> On Fri, Feb 23, 2024 at 04:10:00PM -0600, Bjorn Helgaas wrote:
> > On Fri, Feb 23, 2024 at 04:21:16PM +0100, Johan Hovold wrote:
> > > Commit 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting
> > > 1.9.0 ops") started enabling ASPM unconditionally when the hardware
> > > claims to support it. This triggers Correctable Errors for some PCIe
> > > devices on machines like the Lenovo ThinkPad X13s, which could indicate
> > > an incomplete driver ASPM implementation or that the hardware does in
> > > fact not support L0s.
> > 
> > Are there any more details about this?  Do the errors occur around
> > suspend/resume, a power state transition, or some other event?  Might
> > other DWC-based devices be susceptible?  Is there a specific driver
> > you suspect might be incomplete?
> 
> I see these errors when the devices in question are active as well as
> idle (not during suspend/resume). For example, when running iperf3 or
> fio to test the wifi and nvme, but I also see this occasionally for a
> wifi device which is (supposedly) not active (e.g. a handful errors over
> night).
> 
> I skimmed Qualcomm's driver and noted that there are some registers
> related to ASPM which that driver updates, while the mainline driver
> leaves them at their default settings, but I essentially only mentioned
> that the ASPM implementation may be incomplete as a theoretical
> possibility. The somewhat erratic ASPM behaviour for one of the modems
> also suggests that some further tweak/quirk may be needed, and I was
> hoping to catch Mani's interest by reporting it.
> 
> But based on what I've since heard from Qualcomm, it seems like these
> correctable error may be a known issue with the hardware (e.g. seen
> also with Windows), which is also why we decided to disable it for all
> controllers on these two platforms where I've seen this in v2.
>  
> > Do you want the DT approach because the problem is believed to be
> > platform-specific?  Otherwise, maybe we should consider reverting
> > 9f4f3dfad8cf until the problem is understood?
> 
> Enabling ASPM gave a very significant improvement in battery life on the
> Lenovo ThinkPad X13s, from 10.5 h to 15 h, so reverting is not really an
> option there.

Ah, I missed that you're only disabling L0s, but leaving L1 enabled,
thanks!

And given that the v1.9.0 ops that enable ASPM are used on a bunch of
platforms, and L0s seems to work fine on most of them, we wouldn't
want to disable L0s for everybody, so this seems like the right
solution.

Bjorn

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ