lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250923232712.GA2092207@bhelgaas>
Date: Tue, 23 Sep 2025 18:27:12 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: Brian Norris <briannorris@...omium.org>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>, linux-kernel@...r.kernel.org,
	linux-pci@...r.kernel.org, stable@...r.kernel.org,
	Ethan Zhao <etzhao1900@...il.com>,
	Andrey Ryabinin <ryabinin.a.a@...il.com>
Subject: Re: [PATCH] PCI/sysfs: Ensure devices are powered for config reads

On Tue, Sep 23, 2025 at 04:07:29PM -0700, Brian Norris wrote:
> On Tue, Sep 23, 2025 at 02:02:31PM -0500, Bjorn Helgaas wrote:
> > On Wed, Aug 20, 2025 at 10:26:08AM -0700, Brian Norris wrote:
> > > From: Brian Norris <briannorris@...gle.com>
> > > 
> > > max_link_speed, max_link_width, current_link_speed, current_link_width,
> > > secondary_bus_number, and subordinate_bus_number all access config
> > > registers, but they don't check the runtime PM state. If the device is
> > > in D3cold, we may see -EINVAL or even bogus values.
> > > 
> > > Wrap these access in pci_config_pm_runtime_{get,put}() like most of the
> > > rest of the similar sysfs attributes.
> > 
> > Protecting the config reads seems right to me.
> > 
> > If the device is in D3cold, a config read will result in a Completion
> > Timeout.  On most x86 platforms that's "fine" and merely results in ~0
> > data.  But that's merely convention, not a PCIe spec requirement.
> > 
> > I think it's a potential issue with PCIe controllers used on arm64 and
> > might result in an SError or synchronous abort from which we don't
> > recover well.  I'd love to hear actual experience about how reading
> > "current_link_speed" works on a device in D3cold in an arm64 system.
> 
> I'm working on a few such arm64 systems :) (pcie-qcom Chromebooks, and
> non-upstream DWC-based Pixel phones; I have a little more knowledge of
> the latter.) The answers may vary by SoC, and especially by PCIe
> implementation. ARM SoCs are notoriously ... diverse.
> 
> To my knowledge, it can be several of the above on arm64 + DWC.
> 
> * pci_generic_config_read() -> pci_ops::map_bus() may return NULL, in
>   which case we get PCIBIOS_DEVICE_NOT_FOUND / -EINVAL. And specifically
>   on arm64 with DWC PCIe, dw_pcie_other_conf_map_bus() may see the link
>   down on a suspended bridge and return NULL.
> 
> * The map_bus() check is admittedly racy, so we might still *actually*
>   hit the hardware, at which point this gets even more
>   implementation-defined:
> 
>   (a) if the PCIe HW is not powered (for example, if we put the link to
>       L3 and fully powered off the controller to save power), we might
>       not even get a completion timeout, and it depends on how the
>       SoC is wired up. But I believe this tends to be SError, and a
>       crash.
> 
>   (b) if the PCIe HW is powered but something else is down (e.g., link
>       in L2, device in D3cold, etc.), we'll get a Completion Timeout,
>       and a ~0 response. I also was under the impression a ~0 response
>       is not spec-mandated, but I believe it's noted in the Synopsys
>       documentation.

The ~0 response is not required by the PCIe spec, although there's at
least one implementation note to the effect that a Root Complex
intended for use with software that depends on ~0 data when a config
request fails with Unsupported Request must synthesize that value
(this one is from PCIe r7.0, sec 2.3.2).

> NB: I'm not sure there is really great upstream support for arm64 +
> D3cold yet. If they're not using ACPI (as few arm64 systems do), they
> probably don't have the appropriate platform_pci_* hooks to really
> manage it properly. There have been some prior attempts at adding
> non-x86/ACPI hooks for this, although for different reasons:
> 
>     https://lore.kernel.org/linux-pci/a38e76d6f3a90d7c968c32cee97604f3c41cbccf.camel@mediatek.com/
>     [PATCH] PCI:PM: Support platforms that do not implement ACPI
> 
> That submission stalled because it didn't really have the whole picture
> (in that case, the wwan/modem driver in question).
> 
> > As Ethan and Andrey pointed out, we could skip max_link_speed_show()
> > because pcie_get_speed_cap() already uses a cached value and doesn't
> > do a config access.
> 
> Ack, I'll drop that part of the change.
> 
> > max_link_width_show() is similar and also comes from PCI_EXP_LNKCAP
> > but is not currently cached, so I think we do need that one.  Worth a
> > comment to explain the non-obvious difference.
> 
> Sure, I'll add a comment for max_link_width.
> 
> > PCI_EXP_LNKCAP is ostensibly read-only and could conceivably be
> > cached, but the ASPM exit latencies can change based on the Common
> > Clock Configuration.
> 
> I'll plan not to add additional caching, unless excess wakeups becomes a
> problem.

Perfect, thanks, I'll watch for this.

Bjorn

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ