Message-ID: <f8965cfe-de9a-439c-84e3-63da066aa74f@rowland.harvard.edu>
Date: Fri, 29 Aug 2025 15:58:45 -0400
From: Alan Stern <stern@...land.harvard.edu>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: Thinh Nguyen <Thinh.Nguyen@...opsys.com>,
	ryan zhou <ryanzhou54@...il.com>, Roy Luo <royluo@...gle.com>,
	"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
	"linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>
Subject: Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate
 child device xxx.dwc3 but parent is not active

On Fri, Aug 29, 2025 at 09:23:12PM +0200, Rafael J. Wysocki wrote:
> On Fri, Aug 29, 2025 at 3:25 AM Alan Stern <stern@...land.harvard.edu> wrote:
> > It sounds like the real question is how we should deal with an
> > interrupted system suspend.  Suppose parent device A and child device B
> > are both in runtime suspend when a system sleep transition begins.  The
> > PM core invokes the ->suspend callback of B (and let's say the callback
> > doesn't need to do anything because B is already suspended with the
> > appropriate wakeup setting).
> >
> > But then before the PM core invokes the ->suspend callback of A, the
> > system sleep transition is cancelled.  So the PM core goes through the
> > device tree from parents to children, invoking the ->resume callback for
> > all the devices whose ->suspend callback was called earlier.  Thus, A's
> > ->resume is skipped because A's ->suspend wasn't called, but B's
> > ->resume callback _is_ invoked.  This callback fails, because it can't
> > resume B while A is still in runtime suspend.
> >
> > The same problem arises if A isn't a parent of B but there is a PM
> > dependency from B to A.
> >
> > It's been so long since I worked on the system suspend code that I don't
> > remember how we decided to handle this scenario.
> 
> We actually have not made any specific decision in that respect.  That
> is, in the error path, the core will invoke the resume callbacks for
> devices whose suspend callbacks were invoked and it won't do anything
> beyond that because it has too little information on what would need
> to be done.
> 
> Arguably, though, the failure case described above is not different
> from regular resume during which the driver of A decides to retain the
> device in runtime suspend.
> 
> I'm not sure if the core can do anything about it.
> 
> But at the time B's resume callback is invoked, runtime PM is
> enabled for A, so the driver of B may as well use pm_runtime_resume()
> to resume the device if it wants to do so.  It may also decide to do
> nothing, as in the suspend callback.

Good point.  Since both devices were in runtime suspend before the sleep 
transition started, there's no reason they can't remain in runtime 
suspend after the sleep transition is cancelled.

On the other hand, it seems clear that this scenario doesn't get very 
much testing.  I'm pretty sure the USB subsystem in general is 
vulnerable to this problem; it doesn't consider suspended devices to be 
in different states according to the reason for the suspend.  That is, a 
USB device suspended for runtime PM is in the same state as a device 
suspended for system PM (aside from minor details like wakeup settings).  
Consequently the ->resume and ->runtime_resume callbacks do essentially 
the same thing, both assuming the parent device is not suspended.  As we 
have discussed, this assumption isn't always correct.
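
To make the failure concrete, here is a toy userspace model of the 
cancelled transition.  It is only a sketch: struct toy_dev, 
toy_resume() and toy_aborted_sleep() are made-up names, not kernel 
APIs, and the real PM core does a great deal more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; none of these names are kernel APIs. */
struct toy_dev {
	const char *name;
	struct toy_dev *parent;		/* NULL for a root device */
	bool suspended;			/* low-power state, whatever the reason */
	bool suspend_called;		/* system-sleep ->suspend was invoked */
};

/* Models a USB-style ->resume: it assumes the parent is already active. */
static int toy_resume(struct toy_dev *dev)
{
	if (dev->parent && dev->parent->suspended)
		return -1;		/* can't resume under a suspended parent */
	dev->suspended = false;
	return 0;
}

/*
 * devs[] is ordered parents first.  The suspend pass walks it backwards
 * (children first) and is cancelled after abort_after callbacks.  The
 * error path then walks forwards and resumes only those devices whose
 * ->suspend was called.  Returns the number of failed resume callbacks.
 */
static int toy_aborted_sleep(struct toy_dev **devs, size_t n,
			     size_t abort_after)
{
	size_t i, called = 0;
	int failures = 0;

	assert(abort_after <= n);

	for (i = n; i > 0 && called < abort_after; i--, called++) {
		struct toy_dev *dev = devs[i - 1];

		dev->suspend_called = true;
		dev->suspended = true;	/* no change if already runtime suspended */
	}

	for (i = 0; i < n; i++) {
		if (!devs[i]->suspend_called)
			continue;	/* ->suspend never ran, so skip ->resume */
		devs[i]->suspend_called = false;
		if (toy_resume(devs[i]) != 0)
			failures++;
	}
	return failures;
}
```

With A and B both starting out suspended and the transition cancelled 
after one callback, only B's resume runs and it fails.  If both suspend 
callbacks run before the cancellation, A is resumed first and B's 
resume succeeds.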

I'm open to suggestions for how to handle this.  Should we keep track of 
whether a device was in runtime suspend when a system suspend happens, 
so that the ->resume callback can avoid doing anything?  Will that work 
if the device was the source of a wakeup request?
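
For the sake of discussion, the bookkeeping might look roughly like the 
userspace sketch below.  Everything in it is hypothetical (struct 
sketch_dev, its fields, and the sketch_*() helpers are invented for 
illustration).  In particular, the handling of a pending wakeup is just 
one possible policy, borrowing Rafael's suggestion that the child's 
driver can runtime-resume what it needs; it is not a settled answer to 
the wakeup question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; all names here are illustrative only. */
struct sketch_dev {
	struct sketch_dev *parent;	/* NULL for a root device */
	bool suspended;			/* device is in a low-power state */
	bool rt_suspended_at_sleep;	/* recorded by sketch_suspend() */
	bool wakeup_pending;		/* device signalled a wakeup request */
};

/* System-sleep suspend: remember whether runtime PM already suspended us. */
static void sketch_suspend(struct sketch_dev *dev)
{
	dev->rt_suspended_at_sleep = dev->suspended;
	dev->suspended = true;
}

/* Stand-in for a runtime resume that brings up ancestors first. */
static void sketch_runtime_resume(struct sketch_dev *dev)
{
	if (dev->parent && dev->parent->suspended)
		sketch_runtime_resume(dev->parent);
	dev->suspended = false;
}

/*
 * System-sleep resume.  If the device was runtime suspended when the
 * sleep transition began and has no pending wakeup, leave it alone.
 * Otherwise resume it, waking any suspended ancestors on the way.
 * Returns true if the device was actually resumed.
 */
static bool sketch_resume(struct sketch_dev *dev)
{
	bool was_rt = dev->rt_suspended_at_sleep;

	assert(dev->suspended);		/* only valid after sketch_suspend() */
	dev->rt_suspended_at_sleep = false;

	if (was_rt && !dev->wakeup_pending)
		return false;		/* stay in runtime suspend */

	sketch_runtime_resume(dev);
	return true;
}
```

In this model a child that was runtime suspended before the sleep 
transition stays suspended after a cancelled transition, so its parent 
never has to be touched.  The wakeup case still forces the parent 
awake, which is exactly the part I'm unsure about.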

Alan Stern
