lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJZ5v0jcWQ3QKx=2nzDpnYPyGuYfT4TModwdAreWZu4d0hXmoA@mail.gmail.com>
Date: Wed, 7 May 2025 15:39:55 +0200
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Jon Hunter <jonathanh@...dia.com>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, "Rafael J. Wysocki" <rjw@...ysocki.net>, 
	Linux PM <linux-pm@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>, 
	Alan Stern <stern@...land.harvard.edu>, Ulf Hansson <ulf.hansson@...aro.org>, 
	Johan Hovold <johan@...nel.org>, Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>, 
	Saravana Kannan <saravanak@...gle.com>, 
	"linux-tegra@...r.kernel.org" <linux-tegra@...r.kernel.org>
Subject: Re: [PATCH v3 1/5] PM: sleep: Resume children after resuming the parent

Hi Jon,

On Wed, May 7, 2025 at 3:21 PM Jon Hunter <jonathanh@...dia.com> wrote:
>
> Hi Rafael,
>
> On 02/05/2025 21:33, Rafael J. Wysocki wrote:
>
> ...
>
> >> I have noticed a suspend regression with -next on a couple of our Tegra
> >> boards. Bisect was pointing to the following merge commit ...
> >>
> >> # first bad commit: [218a7bbf861f83398ac9767620e91983e36eac05] Merge
> >> branch 'pm-sleep' into linux-next
> >>
> >> On top of next-20250429 I found that by reverting the following changes
> >> that suspend is working again ...
> >>
> >> Revert "PM: sleep: Resume children after resuming the parent"
> >> Revert "PM: sleep: Suspend async parents after suspending children"
> >> Revert "PM: sleep: Make suspend of devices more asynchronous"
> >
> > I see.
> >
> > Do all three commits need to be reverted to make things work again?
> > The first one only touches the resume path, so it would be surprising
> > if it caused a suspend regression to occur.
>
> I had to revert the other 2 patches for the kernel to build. I tried to
> only revery this patch, and fix the build issue by defining the missing
> function and mutex, but that did not seem to work. The build worked, but
> suspend still failed. I am not sure if that is conclusive though.

The "PM: sleep: Resume children after resuming the parent" patch is
unlikely to introduce the issue, so you should not need to revert it.

The issue is probably introduced by "PM: sleep: Suspend async parents
after suspending children".

> >
> > The most likely commit to cause this issue to happen is the second one
> > because it effectively changes the suspend ordering for "async"
> > devices.
> >
> >> I have been looking into this a bit more to see what device is failing
> >> and by adding a bit of debug I found that entry to suspend was failing
> >> on the Tegra194 Jetson AGX Xavier (tegra194-p2972-0000.dts) platform
> >> when one of the I2C controllers (i2c@...0000) was being suspended.
> >>
> >> I found that if I disable only this I2C controller in device-tree
> >> suspend worked again on top of -next. This I2C controller has 3 devices
> >> on the platform; two ina3221 devices and one Cypress Type-C controller.
> >> I then found that removing only the two ina3221 devices (in
> >> tegra194-p2888.dtsi) also allows suspend to work.
> >>
> >> At this point, I am still unclear why this is now failing.  If you have
> >> any thoughts or things I can try please let me know.
> >
> > So are the devices in question "async"?  To check this, please see the
> > "async" attribute in the "power" subdirectory of the sysfs device
> > directory for each of them.
>
> I checked for both the I2C controller and ina3221 and don't see any
> 'async' files ...
>
> $ ls /sys/class/i2c-dev/i2c-2/device/2-0040/power/
> autosuspend_delay_ms  runtime_active_time  runtime_suspended_time
> control               runtime_status
> $ ls /sys/class/i2c-dev/i2c-2/device/2-0041/power/
> autosuspend_delay_ms  runtime_active_time  runtime_suspended_time
> control               runtime_status
> $ ls /sys/class/i2c-dev/i2c-2/power/
> autosuspend_delay_ms  runtime_active_time  runtime_suspended_time
> control               runtime_status

You need to set CONFIG_PM_ADVANCED_DEBUG to see those (and other debug
attributes).

> > If they are "async", you can write "disable" to this attribute to turn
> > them into "sync" devices.  I'd do this and see what happens.

You may also turn off async suspend altogether:

# echo 0 > /sys/power/pm_async

and see if this helps.

> >
> > Overall, it looks like some dependencies aren't properly represented
> > by device links on this platform.
>
> Yes that would appear to be the case, but at the moment, I don't see
> what it is. The ina3221 devices appear to suspend fine AFAICT, but hangs
> when suspending I2C controller. Exactly where is still a mystery.

So if it works with async suspend disabled, I would check the ordering
in which devices are suspended in that case (probably requires running
with initcall_debug in the kernel command line and echoing 1 to
/sys/power/pm_debug_messages).

Then compare it to the ordering in which devices are suspended with
async suspend enabled (for that, I would hack the I2C controller
suspend return an error early to avoid having to reboot the board
after it hangs).  All of the differences would be suspicious.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ