lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <75av4b6g4slterg5ctdxkevaxvjsymealok47x7tjwv5etqwmj@rotc7nfhhgi5>
Date: Fri, 9 Jan 2026 13:36:32 -0600
From: Bjorn Andersson <andersson@...nel.org>
To: Manivannan Sadhasivam <manivannan.sadhasivam@....qualcomm.com>
Cc: Manivannan Sadhasivam <mani@...nel.org>, 
	Ulf Hansson <ulf.hansson@...aro.org>, konradybcio@...nel.org, linux-arm-msm@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 0/1] soc: qcom: rpmh-rsc: Register s2idle_ops to
 indicate s2ram behavior in s2idle

On Thu, Jan 08, 2026 at 10:55:16AM +0530, Manivannan Sadhasivam wrote:
> On Tue, Jan 06, 2026 at 11:21:59AM -0600, Bjorn Andersson wrote:
> > On Mon, Jan 05, 2026 at 10:27:41PM +0530, Manivannan Sadhasivam wrote:
> > > On Mon, Jan 05, 2026 at 08:04:54AM -0600, Bjorn Andersson wrote:
> > > > On Thu, Jan 01, 2026 at 10:21:57PM +0530, Manivannan Sadhasivam wrote:
> > > > > Hi,
> > > > > 
> > > > 
> > > > Many thanks for looking into this problem, Mani.
> > > > 
> > > > > This is just an attempt to let the device drivers know of the quirky platform
> > > > > behavior of mimicking s2ram in s2idle for older Qcom SoCs (pre-Hamoa and
> > > > > non-chromebooks) using RPMh. This information is important for the device
> > > > > drivers as they need to prepare for the possible power loss during system
> > > > > suspend by shutting down or resetting the devices.
> > > > > 
> > > > 
> > > > Is this really what happens? What _exactly_ is the purpose of
> > > > pm_suspend_via_firmware() and pm_resume_via_firmware()?
> > > > 
> > > > The kernel-doc for pm_suspend_via_firmware() states "return true if
> > > > platform firmware is going to be invoked at end of system suspend...in
> > > > order to complete it".
> > > > 
> > > > To me this indicates that the purpose of this mechanism is to signal
> > > > Linux that we're running on a target where platform firmware will cut
> > > > power once we reach the bottom of the suspend chain, so the individual
> > > > drivers has to save/restore state in order to prepare for this.
> > > > 
> > > 
> > > Yes!
> > > 
> > 
> > But that's not the case here. In fact, iirc even on Makena the PCIe
> > controller is retained, it's just the link that can't be sustained
> > without CX - but I might misremember.
> > 
> 
> There is no MX retention for PCIe on Makena.
> 

Okay, thanks for correcting my memory.

[..]
> > > We tried to fix the issue in the NVMe driver so far, but all efforts ended up in
> > > vain. NVMe maintainers preferred to rely on some PCI/PM APIs for this. Initially
> > > I looked into creating one such API, but then figured out that I can just make
> > > use of this global flag and not touch the NVMe driver at all.
> > > 
> > 
> > At the same time acpi_storage_d3() tells me that this problem is already
> > accepted by the community, we just don't have a way to signal the same
> > in our system.
> > 
> 
> There are many solutions initially accepted for x86/ACPI which are not getting
> accepted now.
> 

It's inspiring to see the whole x86 ecosystem moving beyond such issues.

> > > This worked well for s2ram, so I resent a patch submitted by Konrad in 2024 [1].
> > > But for s2idle and Makena, I thought I could reuse the same global flag and
> > > achieve the same net result.
> > > 
> > 
> > My problem remains that we're using the global "power to all IP-blocks
> > will be lost"-flag to say "on Makena, the PCIe controller doesn't retain
> > state in low power mode", to a single driver.
> > 
> 
> I agree with your concern. If not global, we need to either fix it in PCI core
> or in the NVMe driver. Unfortunately, for an one-off issue like this, making
> change in both proves difficult. As you may know, our initial work was to get it
> fixed in the NVMe driver, but that didn't fly.
> 

The fact that the PCIe controller is "broken" does not imply that the
whole SoC is broken, and claiming so will come with side effects.

[..]
> > > > Do we force all "pre-Hamoa" targets into this same behavior? Or do we
> > > > have a different flag for saying "at the end of suspend power will be
> > > > lost" there?
> > > > 
> > > 
> > > If you don't agree with the above reasoning, I can just limit its usage to
> > > Makena and s2ram.
> > > 
> > 
> > I don't think I have all the details, but if we're saying that Makena is
> > broken and need special treatment in NVMe, it would be better to say
> > just that, with a patch in check_vendor_combination_bug()
> > 
> > /* Qualcomm SC8280XP can not enter low-power with PCIe link active */
> > if (of_machine_is_compatible("qcom,sc8280xp"))
> > 	return NVME_QUIRK_SIMPLE_SUSPEND;
> > 
> 
> I tried it in multiple attempts. Recent one is this:
> https://lore.kernel.org/all/20250126050309.7243-1-manivannan.sadhasivam@linaro.org/
> 
> And here is the reply:
> https://lore.kernel.org/all/20250210040446.GA2823@lst.de/

I presume "core" doesn't necessarily imply "system-wide", but you know
the PCIe stack better than me, so I'm not sure where in the "core" this
would be better implemented.

> 
> > 
> > Alternatively, make it possible to set StorageD3Enable in DeviceTree.
> > This flag exists in ACPI for a reason, why wouldn't we see the same
> > problems when describing the system using DeviceTree?
> > 
> 
> This is not a ACPI spec defined object, but just a MSFT way of telling the OS
> when to enter D3 (D3Hot or D3Cold) for storage devices. If we want to go with
> the DT property, it has to be something like 'qcom,no-retention'. I don't think
> a generic property makes sense here.
> 

Unless I'm reading the nvme driver incorrectly, I don't think the need
for such property is vendor-specific. But conceptually it doesn't really
belong in system description, given that the user might swap NVMe and
introduce/remove the need for this quirk.

> Even then, we would need a PCI API to translate that for client drivers. I can
> look into this direction.
> 

I hope that we can agree that this is a property of the PCIe controller,
for Makena. If we can't move the problem with NVMe on Makena out of the
way of progress, how about we work our way around it (for now
hopefully...)?

Change qcom_pcie_suspend_noirq() to explicitly carry a vote for CX (e.g.
using the bw-vote as today), describe the problem explictily in a
comment, and then move forward with releasing the CX vote for all other
targets.

Regards,
Bjorn

> - Mani
> 
> -- 
> மணிவண்ணன் சதாசிவம்

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ