lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250113162549.a2y7dlwnsfetryyw@thinkpad>
Date: Mon, 13 Jan 2025 21:55:49 +0530
From: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
To: rafael@...nel.org, ulf.hansson@...aro.org,
	Johan Hovold <johan@...nel.org>
Cc: Krishna Chaitanya Chundru <quic_krichai@...cinc.com>,
	Kevin Xie <kevin.xie@...rfivetech.com>,
	Lorenzo Pieralisi <lpieralisi@...nel.org>,
	Krzysztof Wilczyński <kw@...ux.com>,
	Rob Herring <robh@...nel.org>, Bjorn Helgaas <bhelgaas@...gle.com>,
	Markus.Elfring@....de, quic_mrana@...cinc.com, rafael@...nel.org,
	m.szyprowski@...sung.com, linux-pm@...r.kernel.org,
	linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
	regressions@...ts.linux.dev
Subject: Re: [PATCH v7 2/2] PCI: Enable runtime pm of the host bridge

+ Ulf (for the runtime PM related question)

On Tue, Jan 07, 2025 at 03:27:59PM +0100, Johan Hovold wrote:
> On Tue, Jan 07, 2025 at 07:40:39PM +0530, Krishna Chaitanya Chundru wrote:
> > On 1/7/2025 6:49 PM, Johan Hovold wrote:
> 
> > >> @@ -3106,6 +3106,17 @@ int pci_host_probe(struct pci_host_bridge *bridge)
> > >>   		pcie_bus_configure_settings(child);
> > >>   
> > >>   	pci_bus_add_devices(bus);
> > >> +
> > >> +	/*
> > >> +	 * Ensure pm_runtime_enable() is called for the controller drivers,
> > >> +	 * before calling pci_host_probe() as pm frameworks expects if the
> > >> +	 * parent device supports runtime pm then it needs to enabled before
> > >> +	 * child runtime pm.
> > >> +	 */
> > >> +	pm_runtime_set_active(&bridge->dev);
> > >> +	pm_runtime_no_callbacks(&bridge->dev);
> > >> +	devm_pm_runtime_enable(&bridge->dev);
> > >> +
> > >>   	return 0;
> > >>   }
> > >>   EXPORT_SYMBOL_GPL(pci_host_probe);
> > > 
> > > I just noticed that this change in 6.13-rc1 is causing the following
> > > warning on resume from suspend on machines like the Lenovo ThinkPad
> > > X13s:
> 
> > Can you confirm if you are seeing this issue is seen in the boot-up
> > case also. As this part of the code executes only at the boot time and
> > will not have effect in resume from suspend.
> 
> No, I only see it during resume. And enabling runtime PM can (and in
> this case, obviously does) impact system suspend as well. 
> 
> > > 	pci0004:00: pcie4: Enabling runtime PM for inactive device with active children
> 
> > I believe this is not causing any functional issues.
> 
> It still needs to be fixed.
> 
> > > which may have unpopulated ports (this laptop SKU does not have a modem).
> 
> > Can you confirm if this warning goes away if there is some endpoint
> > connected to it.
> 
> I don't have anything to connect to the slot in this machine, but this
> seems to be the case as I do not see this warning for the populated
> slots, nor on the CRD reference design which has a modem on PCIe4.
> 

Yes, this is only happening for unpopulated slots and the warning shows up only
if runtime PM is enabled for both PCI bridge and host bridge. This patch enables
the runtime PM for host bridge and if the PCI bridge runtime PM is also enabled
(only happens now for ACPI/BIOS based platforms), then the warning shows up only
if the PCI bridge was RPM suspended (mostly happens if there was no device
connected) during the system wide resume time.

For the sake of reference, PCI host bridge is the parent of PCI bridge.

Looking at where the warning gets triggered (in pm_runtime_enable()), we have
the below checks:

dev->power.runtime_status == RPM_SUSPENDED
!dev->power.ignore_children
atomic_read(&dev->power.child_count) > 0

When pm_runtime_enable() gets called for PCI host bridge:

dev->power.runtime_status = RPM_SUSPENDED
dev->power.ignore_children = 0
dev->power.child_count = 1

First 2 passes seem legit, but the issue is with the 3rd one. Here, the
child_count of 1 means that the PCI host bridge has an 'active' child (which is
the PCI bridge). The PCI bridge was supposed to be RPM_SUSPENDED as the resume
process should first resume the parent (PCI host bridge). But this is not the
case here.

Then looking at where the child_count gets incremented, it leads to
pm_runtime_set_active() of device_resume_noirq(). pm_runtime_set_active() is
only called for a device if dev_pm_skip_suspend() succeeds, which requires
DPM_FLAG_SMART_SUSPEND flag to be set and the device to be runtime suspended.

This criteria matches for PCI bridge. So its status was set to 'RPM_ACTIVE' even
though the parent PCI host bridge was still in the RPM_SUSPENDED state. I don't
think this is a valid condition as seen from the warning triggered for PCI host
bridge when pm_runtime_enable() is called from device_resume_early():

pci0004:00: pcie4: Enabling runtime PM for inactive device with active children

I'm not sure of what the fix is in this case. But removing the
DPM_FLAG_SMART_SUSPEND flag from PCI bridge driver (portdrv) makes the warning
go away. This indicates that something is wrong with the DPM_FLAG_SMART_SUSPEND
flag handling in PM core.

Ulf/Rafael, thoughts?

- Mani

-- 
மணிவண்ணன் சதாசிவம்

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ