Message-ID: <Pine.LNX.4.44L0.1009030953060.1548-100000@iolanthe.rowland.org>
Date:	Fri, 3 Sep 2010 10:04:14 -0400 (EDT)
From:	Alan Stern <stern@...land.harvard.edu>
To:	Colin Cross <ccross@...roid.com>
cc:	"Rafael J. Wysocki" <rjw@...k.pl>, <linux-kernel@...r.kernel.org>,
	<linux-pm@...ts.linux-foundation.org>, Pavel Machek <pavel@....cz>,
	Len Brown <len.brown@...el.com>,
	Greg Kroah-Hartman <gregkh@...e.de>,
	Randy Dunlap <randy.dunlap@...cle.com>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] PM: Prevent waiting forever on asynchronous resume after
 abort

On Thu, 2 Sep 2010, Colin Cross wrote:

> You're right, wait_event would be much worse.
> 
> I think there's another race condition during suspend.  If an
> asynchronous device calls device_pm_wait_for_dev on a device that
> hasn't had device_suspend called on it yet, power.completion will
> still be set from initialization or the last time it completed resume,
> and it won't wait.

That can't happen in a properly-designed system.  It would mean the 
async device didn't suspend because it was waiting for a device which 
was registered before it -- and that would deadlock even if you used 
synchronous suspend.

> Assuming that problem is fixed somehow, there's also a deadlock
> possibility.  Consider 3 devices.  A, B, and C, registered in that
> order.  A is async, and the suspend handler calls
> device_pm_wait_for_dev(C).  B's suspend handler returns an error.  A's
> suspend handler is now stuck waiting on C->power.completion, but
> device_suspend(C) will never be called.

Why not?  The normal suspend order is last-to-first, so C will be 
suspended before B.

> There is also an unhandled edge condition - what is the expected
> behavior for a call to device_pm_wait_for_dev on a device if the
> suspend handler for that device returns an error?  Currently, the
> calling device will continue as if the target device had suspended.

It looks like __device_suspend needs to set async_error.  Which means 
async_suspend doesn't need to set it.  This is indeed a bug.

> What about splitting power.completion into two flags,
> power.suspend_complete and power.resume_complete?
> power.resume_complete is initialized to 1, because the devices start
> resumed.  Clear power.suspend_complete for all devices at the
> beginning of dpm_suspend, and clear power.resume_complete for any
> device that is suspended at the beginning of dpm_resume.  The
> semantics of each flag is then always clear.  Any time between the
> beginning and end of dpm_suspend, waiting on any device's
> power.suspend_complete will block until that device is in suspend.
> Any time between the beginning and end of dpm_resume, waiting on
> power.resume_complete will block IFF the device is suspended.

How are you going to wait for these things?  With wait_event?  Didn't 
you say above that it would be worse than using completions?

> A solution to the 2nd and 3rd problems would still be needed - a way
> to abort drivers that call device_pm_wait_for_dev when suspend is
> aborted, and a return value to tell them the device being waited on is
> not suspended.

No solutions are needed.  See above.

Alan Stern
