linux-kernel - Re: PCIe link training and pwrctrl sequence

Message-ID: <rz6ajnl7l25hfl2u7lloywtw7sq7smhb63hg76wjslyuwyjb7a@fhuafuino5kv>
Date: Wed, 5 Nov 2025 15:01:03 +0530
From: Manivannan Sadhasivam <mani@...nel.org>
To: Chen-Yu Tsai <wens@...nel.org>
Cc: Bartosz Golaszewski <brgl@...ev.pl>, 
	Bjorn Helgaas <bhelgaas@...gle.com>, linux-kernel <linux-kernel@...r.kernel.org>, 
	linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>, PCI <linux-pci@...r.kernel.org>, 
	"open list:THERMAL" <linux-pm@...r.kernel.org>
Subject: Re: PCIe link training and pwrctrl sequence

On Wed, Oct 22, 2025 at 12:39:41AM +0800, Chen-Yu Tsai wrote:
> (recipient list trimmed down and added PCI & pwrctrl maintainers and lists)
> 
> On Tue, Oct 21, 2025 at 8:54 PM Manivannan Sadhasivam <mani@...nel.org> wrote:
> >
> > On Tue, Oct 21, 2025 at 02:22:46PM +0200, Bartosz Golaszewski wrote:
> > > On Tue, Oct 21, 2025 at 2:20 PM Manivannan Sadhasivam <mani@...nel.org> wrote:
> > > >
> > > > >
> > > > > And with the implementation this series proposes it would mean that
> > > > > the perst signal will go high after the first endpoint pwrctl driver
> > > > > sets it to high and only go down once the last driver sets it to low.
> > > > > The only thing I'm not sure about is the synchronization between the
> > > > > endpoints - how do we wait for all of them to be powered-up before
> > > > > calling the last gpiod_set_value()?
> > > > >
> > > >
> > > > That will be handled by the pwrctrl core. Not today, but in the coming days.
> > > >
> > >
> > > But is this the right approach or are you doing it this way *because*
> > > there's no support for enable-counted GPIOs as of yet?
> > >
> >
> > This is the right approach since as of today, pwrctrl core scans the bus, tries
> > to probe the pwrctrl driver (if one exists for the device to be scanned), powers
> > it ON, and deasserts the PERST#. If the device is a PCI bridge/switch, then the
> > devices underneath the downstream bus will only be powered ON after the further
> > rescan of the downstream bus. But the pwrctrl drivers for those devices might
> > get loaded at any time (even after the bus rescan).
> >
> > This causes several issues with the PCI core as this behavior sort of emulates
> > the PCI hot-plug (devices showing up at random times after bus scan). If the
> > upstream PCI bridge/switch is not hot-plug capable, then the devices that were
> > showing up later will fail to enumerate due to lack of resources. The failure
> > is due to PCI core limiting the resources for non hot-plug PCI bridges as it
> > doesn't expect the devices to show up later in the downstream port.
> 
> Side note:
> 
> Today I was looking into how the PCI core does slot pwrctrl, and it doesn't
> really work for some of the PCI controller drivers.
> 
> The pwrctrl stuff happens after the driver adds the host bus bridge.
> However drivers are doing link training before that. If the power is
> not on, link training will fail, and the driver errors out. It never
> has a chance to get to pwrctrl.
> 
> I wonder if some bits should be split out so they could be interleaved with
> link management on the host side. AFAICT only dwc and qcom will rescan the
> bus when an interrupt says the link is up. Other controllers might not have
> such an interrupt notification. I was looking at the MediaTek gen3 driver
> specifically.
> 

This is a known issue. With the initial design of the pwrctrl framework, we
assumed that the pwrctrl devices could be created without controller
intervention. That assumption is proving wrong, as some controllers, like
yours, expect the devices to show up before PHY initialization.

We are working on a series (almost complete, just needs cleanup) that moves the
pwrctrl creation to a new exported API and allows the controller drivers to call
the API from wherever they want, based on their requirements. This series also
fixes the above-mentioned hotplug/PERST# issues by scanning all the PCIe nodes
in one shot, creating the pwrctrl devices, and making sure all of them get
probed (just the pwrctrl drivers, not the PCIe client drivers) before
deasserting PERST# and then scanning the bus. This requires all the pwrctrl
drivers to be loaded before the controller driver probes (otherwise, probe
deferral will happen), but that's a valid dependency.

This will allow the PCI core to find all the devices during the initial bus
scan and will fix the resource allocation issue.
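
To make the intended ordering concrete, here is a rough userspace model of the
sequence described above. It is only a sketch and not code from the series:
Node, ProbeDeferred, bring_up() and the idea of matching drivers by node name
are all hypothetical stand-ins, and nothing here corresponds to a real kernel
API.

```python
from dataclasses import dataclass, field
from typing import List


class ProbeDeferred(Exception):
    """Hypothetical stand-in for probe deferral (a pwrctrl driver is missing)."""


@dataclass
class Node:
    """Hypothetical model of one PCIe node under the controller."""
    name: str
    needs_pwrctrl: bool = False
    children: List["Node"] = field(default_factory=list)


def walk(node):
    """Yield the node and everything below it, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def bring_up(root, loaded_drivers):
    """Return the ordered bring-up steps, or raise ProbeDeferred.

    loaded_drivers is a set of node names whose pwrctrl driver is available
    (a simplification made for this sketch).
    """
    steps = []

    # 1. Scan all PCIe nodes in one shot and create the pwrctrl devices.
    targets = [n for n in walk(root) if n.needs_pwrctrl]
    for n in targets:
        steps.append(f"create pwrctrl device for {n.name}")

    # 2. Every pwrctrl driver must be bound; otherwise defer the whole probe.
    missing = [n.name for n in targets if n.name not in loaded_drivers]
    if missing:
        raise ProbeDeferred(", ".join(missing))
    for n in targets:
        steps.append(f"power on {n.name}")

    # 3. Only now release the reset, then do the single initial bus scan.
    steps.append("deassert PERST#")
    steps.append("scan bus")
    return steps


if __name__ == "__main__":
    switch = Node("switch", needs_pwrctrl=True,
                  children=[Node("wifi", needs_pwrctrl=True), Node("nvme")])
    root = Node("root-port", children=[switch])
    for step in bring_up(root, {"switch", "wifi"}):
        print(step)
```

The point of the model is that devices behind the switch are powered before
PERST# is released, so the one initial scan sees them all; if any pwrctrl
driver is missing, nothing is deasserted or scanned and the probe is retried
later.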

- Mani

-- 
மணிவண்ணன் சதாசிவம்