[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <qu3gnvl7t7ehpxhkchz6ragjoeafktwr4dtstattthfv3jezd7@zrfwrlr2vzx5>
Date: Mon, 17 Nov 2025 23:07:01 +0530
From: Manivannan Sadhasivam <mani@...nel.org>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@....qualcomm.com>,
lpieralisi@...nel.org, kwilczynski@...nel.org, bhelgaas@...gle.com, will@...nel.org,
linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org, robh@...nel.org,
linux-arm-msm@...r.kernel.org, zhangsenchuan@...incomputing.com, vincent.guittot@...aro.org
Subject: Re: [PATCH v2 2/3] PCI: qcom: Check for the presence of a device
instead of Link up during suspend
On Thu, Nov 13, 2025 at 11:22:50AM -0600, Bjorn Helgaas wrote:
> On Thu, Nov 13, 2025 at 10:24:17PM +0530, Manivannan Sadhasivam wrote:
> > On Thu, Nov 13, 2025 at 10:41:47AM -0600, Bjorn Helgaas wrote:
> > > On Fri, Nov 07, 2025 at 10:13:18AM +0530, Manivannan Sadhasivam wrote:
> > > > The suspend handler checks for the PCIe Link up to decide when to turn off
> > > > the controller resources. But this check is racy as the PCIe Link can go
> > > > down just after this check.
> > > >
> > > > So use the newly introduced API, pci_root_ports_have_device() that checks
> > > > for the presence of a device under any of the Root Ports to replace the
> > > > Link up check.
> > >
> > > Why is pci_root_ports_have_device() itself not racy?
> >
> > Because it is very uncommon for the 'pci_dev' to go away during the
> > host controller suspend. It might still be possible in edge cases,
> > but very common as the link down. I can reword it.
>
> I guess it's better to acknowledge replacing one race with another
> than it would be to suggest that this *removes* a race.
>
Ok.
> But I don't understand the point of this. Is
> pci_root_ports_have_device() *less* racy than the
> qcom_pcie_suspend_noirq() check? Why would that be?
>
The check is supposed to perform deinit only if there are no devices connected
to the slot. And the reason to skip the deinit was mostly due to driver behavior
like NVMe driver, which expects the device to be in D0 even during system
suspend on non-x86 platforms.
Since the check is for the existence of the device nevertheless, I thought,
making use of pci_root_ports_have_device() serves the purpose instead of
checking the data link layer status.
> I'm kind of skeptical about adding pci_root_ports_have_device() at
> all. It seems like it just encourages racy behavior in drivers.
>
I agree that though it is not very common, but with async suspend, it is
possible that 'pci_dev' may get removed during controller suspend.
So I've dropped this series from controller/dwc until we conclude.
- Mani
> > > > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@....qualcomm.com>
> > > > ---
> > > > drivers/pci/controller/dwc/pcie-qcom.c | 6 ++++--
> > > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
> > > > index 805edbbfe7eb..b2b89e2e4916 100644
> > > > --- a/drivers/pci/controller/dwc/pcie-qcom.c
> > > > +++ b/drivers/pci/controller/dwc/pcie-qcom.c
> > > > @@ -2018,6 +2018,7 @@ static int qcom_pcie_probe(struct platform_device *pdev)
> > > > static int qcom_pcie_suspend_noirq(struct device *dev)
> > > > {
> > > > struct qcom_pcie *pcie;
> > > > + struct dw_pcie_rp *pp;
> > > > int ret = 0;
> > > >
> > > > pcie = dev_get_drvdata(dev);
> > > > @@ -2053,8 +2054,9 @@ static int qcom_pcie_suspend_noirq(struct device *dev)
> > > > * powerdown state. This will affect the lifetime of the storage devices
> > > > * like NVMe.
> > > > */
> > > > - if (!dw_pcie_link_up(pcie->pci)) {
> > > > - qcom_pcie_host_deinit(&pcie->pci->pp);
> > > > + pp = &pcie->pci->pp;
> > > > + if (!pci_root_ports_have_device(pp->bridge->bus)) {
> > > > + qcom_pcie_host_deinit(pp);
> > > > pcie->suspended = true;
> > > > }
> > > >
> > > > --
> > > > 2.48.1
> > > >
> >
> > --
> > மணிவண்ணன் சதாசிவம்
--
மணிவண்ணன் சதாசிவம்
Powered by blists - more mailing lists