lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250128155830.xc6y2swqqw5okt32@thinkpad>
Date: Tue, 28 Jan 2025 21:28:30 +0530
From: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: Ulf Hansson <ulf.hansson@...aro.org>, Johan Hovold <johan@...nel.org>,
	Krishna Chaitanya Chundru <quic_krichai@...cinc.com>,
	Kevin Xie <kevin.xie@...rfivetech.com>,
	Lorenzo Pieralisi <lpieralisi@...nel.org>,
	Krzysztof Wilczyński <kw@...ux.com>,
	Rob Herring <robh@...nel.org>, Bjorn Helgaas <bhelgaas@...gle.com>,
	Markus.Elfring@....de, quic_mrana@...cinc.com,
	m.szyprowski@...sung.com, linux-pm@...r.kernel.org,
	linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
	regressions@...ts.linux.dev
Subject: Re: [PATCH v7 2/2] PCI: Enable runtime pm of the host bridge

On Tue, Jan 28, 2025 at 12:47:09PM +0100, Rafael J. Wysocki wrote:
> On Mon, Jan 27, 2025 at 8:57 PM Rafael J. Wysocki <rafael@...nel.org> wrote:
> >
> > On Mon, Jan 27, 2025 at 3:32 PM Ulf Hansson <ulf.hansson@...aro.org> wrote:
> > >
> > > On Mon, 13 Jan 2025 at 17:25, Manivannan Sadhasivam
> > > <manivannan.sadhasivam@...aro.org> wrote:
> > > >
> > > > + Ulf (for the runtime PM related question)
> > > >
> > > > On Tue, Jan 07, 2025 at 03:27:59PM +0100, Johan Hovold wrote:
> > > > > On Tue, Jan 07, 2025 at 07:40:39PM +0530, Krishna Chaitanya Chundru wrote:
> > > > > > On 1/7/2025 6:49 PM, Johan Hovold wrote:
> > > > >
> > > > > > >> @@ -3106,6 +3106,17 @@ int pci_host_probe(struct pci_host_bridge *bridge)
> > > > > > >>                  pcie_bus_configure_settings(child);
> > > > > > >>
> > > > > > >>          pci_bus_add_devices(bus);
> > > > > > >> +
> > > > > > >> +        /*
> > > > > > >> +         * Ensure pm_runtime_enable() is called for the controller drivers,
> > > > > > >> +         * before calling pci_host_probe() as pm frameworks expects if the
> > > > > > >> +         * parent device supports runtime pm then it needs to enabled before
> > > > > > >> +         * child runtime pm.
> > > > > > >> +         */
> > > > > > >> +        pm_runtime_set_active(&bridge->dev);
> > > > > > >> +        pm_runtime_no_callbacks(&bridge->dev);
> > > > > > >> +        devm_pm_runtime_enable(&bridge->dev);
> > > > > > >> +
> > > > > > >>          return 0;
> > > > > > >>   }
> > > > > > >>   EXPORT_SYMBOL_GPL(pci_host_probe);
> > > > > > >
> > > > > > > I just noticed that this change in 6.13-rc1 is causing the following
> > > > > > > warning on resume from suspend on machines like the Lenovo ThinkPad
> > > > > > > X13s:
> > > > >
> > > > > > Can you confirm if you are seeing this issue is seen in the boot-up
> > > > > > case also. As this part of the code executes only at the boot time and
> > > > > > will not have effect in resume from suspend.
> > > > >
> > > > > No, I only see it during resume. And enabling runtime PM can (and in
> > > > > this case, obviously does) impact system suspend as well.
> > > > >
> > > > > > >   pci0004:00: pcie4: Enabling runtime PM for inactive device with active children
> > > > >
> > > > > > I believe this is not causing any functional issues.
> > > > >
> > > > > It still needs to be fixed.
> > > > >
> > > > > > > which may have unpopulated ports (this laptop SKU does not have a modem).
> > > > >
> > > > > > Can you confirm if this warning goes away if there is some endpoint
> > > > > > connected to it.
> > > > >
> > > > > I don't have anything to connect to the slot in this machine, but this
> > > > > seems to be the case as I do not see this warning for the populated
> > > > > slots, nor on the CRD reference design which has a modem on PCIe4.
> > > > >
> > > >
> > > > Yes, this is only happening for unpopulated slots and the warning shows up only
> > > > if runtime PM is enabled for both PCI bridge and host bridge. This patch enables
> > > > the runtime PM for host bridge and if the PCI bridge runtime PM is also enabled
> > > > (only happens now for ACPI/BIOS based platforms), then the warning shows up only
> > > > if the PCI bridge was RPM suspended (mostly happens if there was no device
> > > > connected) during the system wide resume time.
> > > >
> > > > For the sake of reference, PCI host bridge is the parent of PCI bridge.
> > > >
> > > > Looking at where the warning gets triggered (in pm_runtime_enable()), we have
> > > > the below checks:
> > > >
> > > > dev->power.runtime_status == RPM_SUSPENDED
> > > > !dev->power.ignore_children
> > > > atomic_read(&dev->power.child_count) > 0
> > > >
> > > > When pm_runtime_enable() gets called for PCI host bridge:
> > > >
> > > > dev->power.runtime_status = RPM_SUSPENDED
> > > > dev->power.ignore_children = 0
> > > > dev->power.child_count = 1
> > > >
> > > > First 2 passes seem legit, but the issue is with the 3rd one. Here, the
> > > > child_count of 1 means that the PCI host bridge has an 'active' child (which is
> > > > the PCI bridge). The PCI bridge was supposed to be RPM_SUSPENDED as the resume
> > > > process should first resume the parent (PCI host bridge). But this is not the
> > > > case here.
> > > >
> > > > Then looking at where the child_count gets incremented, it leads to
> > > > pm_runtime_set_active() of device_resume_noirq(). pm_runtime_set_active() is
> > > > only called for a device if dev_pm_skip_suspend() succeeds, which requires
> > > > DPM_FLAG_SMART_SUSPEND flag to be set and the device to be runtime suspended.
> > > >
> > > > This criteria matches for PCI bridge. So its status was set to 'RPM_ACTIVE' even
> > > > though the parent PCI host bridge was still in the RPM_SUSPENDED state. I don't
> > > > think this is a valid condition as seen from the warning triggered for PCI host
> > > > bridge when pm_runtime_enable() is called from device_resume_early():
> > > >
> > > > pci0004:00: pcie4: Enabling runtime PM for inactive device with active children
> > >
> > > Thanks for the detailed analysis, much appreciated.
> > >
> > > So this seems to boil down to the fact that the PM core calls
> > > pm_runtime_set_active() for a device, when it really should not. If
> > > there is a clever way to avoid that, I think we need Rafael's opinion
> > > on.
> >
> > Actually, not really.
> >
> > The status of the child and the child count of the parent have no
> > meaning until runtime PM is enabled for the parent.  They can be
> > manipulated freely before this happens with no consequences and all
> > will be fine as long as those settings are consistent when runtime PM
> > is enabled for the parent.
> >
> > Now, they aren't consistent at that point because
> > dev_pm_skip_suspend() returns false for the parent as it has
> > DPM_FLAG_SMART_SUSPEND clear.
> >
> > To me, this looks like a coding mistake because all devices that have
> > power.must_resume set should also be set to RPM_ACTIVE before
> > re-enabling runtime PM for them, so the attached patch should work.
> 
> Having reflected on it a bit I think that it's better to avoid
> changing the existing behavior too much, so attached is a new version
> of the patch.
> 
> It is along the same lines as before, but it doesn't go as far as the
> previous version.  Namely, in addition to what the existing code does,
> it will cause the runtime PM status to be set to RPM_ACTIVE for the
> devices whose dependents will have it set which should address the
> problem at hand if I'm not mistaken.
> 
> I'd appreciated giving it a go on a system where the warning is printed.
> 

This patch indeed makes the warning go away and I don't spot any other issues.
So you can add my Tested-by tag while submitting the fix.

Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>

Thanks a lot!

- Mani

-- 
மணிவண்ணன் சதாசிவம்

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ