lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <201202252201.13537.rjw@sisk.pl>
Date:	Sat, 25 Feb 2012 22:01:13 +0100
From:	"Rafael J. Wysocki" <rjw@...k.pl>
To:	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
Cc:	John Stultz <john.stultz@...aro.org>,
	Linux PM list <linux-pm@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Magnus Damm <magnus.damm@...il.com>, markgross@...gnar.org,
	Matthew Garrett <mjg@...hat.com>,
	Greg KH <gregkh@...uxfoundation.org>,
	Arve Hjønnevåg <arve@...roid.com>,
	Brian Swetland <swetland@...gle.com>,
	Neil Brown <neilb@...e.de>,
	Alan Stern <stern@...land.harvard.edu>,
	Dmitry Torokhov <dmitry.torokhov@...il.com>
Subject: Re: [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take2

On Saturday, February 25, 2012, Srivatsa S. Bhat wrote:
> On 02/25/2012 04:51 AM, Rafael J. Wysocki wrote:
> 
> > On Friday, February 24, 2012, Srivatsa S. Bhat wrote:
> >> On 02/24/2012 03:02 AM, Rafael J. Wysocki wrote:
> >>
> >>> On Thursday, February 23, 2012, Rafael J. Wysocki wrote:
> >>>> On Thursday, February 23, 2012, Srivatsa S. Bhat wrote:
> >>>>> On 02/23/2012 03:40 AM, Rafael J. Wysocki wrote:
> > [...]
> >>>>>
> >>>>> By the way, I am just curious.. how difficult will this make it for userspace
> >>>>> to disable autosleep? I mean, would a trylock mean that the user has to keep
> >>>>> fighting until he finally gets a chance to disable autosleep?
> >>>>
> >>>> That's a good point, so I think it may be a good idea to do
> >>>> mutex_lock_interruptible() in pm_autosleep_set_state() instead.
> >>>
> >>> Now that I think of it, perhaps it's a good idea to just make
> >>> pm_autosleep_lock() do mutex_lock_interruptible() _and_ make
> >>> pm_autosleep_set_state() use pm_autosleep_lock().
> >>>
> >>> What do you think?
> >>>
> >>
> >>
> >> Well, I don't think mutex_lock_interruptible() would help us much..
> >> Consider what would happen, if we use it:
> >>
> >> * pm-suspend got initiated as part of autosleep. Acquired autosleep lock.
> >> * Userspace is about to get frozen.
> >> * Now, the user tries to write "off" to autosleep. And hence, he is waiting
> >>   for autosleep lock, interruptibly.
> >> * The freezer sent a fake signal to all userspace processes and hence
> >>   this process also got interrupted.. it is no longer waiting on autosleep
> >>   lock - it got the signal and returned, and got frozen.
> >>   (And when the userspace gets thawed later, this process won't have the
> >>    autosleep lock - which is a different (but yet another) problem).
> >>
> >> So ultimately the only thing we achieved is to ensure that freezing of
> >> userspace goes smoothly. But the user process could not succeed in
> >> disabling autosleep. Of course we can work around that by having the
> >> mutex_lock_interruptible() in a loop and so on, but that gets very
> >> ugly pretty soon.
> >>
> >> So, I would suggest the following solution:
> >>
> >> We want to achieve 2 things here:
> >>  a. A user process trying to write to /sys/power/state or
> >>     /sys/power/autosleep should not cause freezing failures.
> >>  b. When a user process writes "off" to autosleep, the suspend/hibernate
> >>     attempt that is on-going, if any, must be immediately aborted, to give
> >>     the user the feeling that his preference has been noticed and respected.
> >>
> >> And to achieve this, we note that a user process can write "off" to autosleep
> >> only until the userspace gets frozen. No chance after that.
> >>
> >> So, let's do this:
> >> 1. Drop the autosleep lock before entering pm-suspend/hibernate.
> >> 2. This means, a user process can get hold of this lock and successfully
> >>    disable autosleep a moment after we initiated suspend, but before userspace
> >>    got frozen fully.
> >> 3. So, to respect the user's wish, we add a check immediately after the
> >>    freezing of userspace is complete - we check if the user disabled autosleep
> >>    and bail out, if he did. Otherwise, we continue and suspend the machine.
> >>
> >> IOW, this is like hitting 2 birds with one stone ;-)
> >> We don't hold autosleep lock throughout suspend/hibernate, but still react
> >> instantly when the user disables autosleep. And of course, freezing of tasks
> >> won't fail, ever! :-)
> > 
> > Well, you essentially are postulating to restore the "interface" wakeup source
> > that was present in the previous version of this patch and that I dropped in
> > order to simplify the code.
> > 
> 
> 
> Oh is it? I guess I haven't followed this thread very closely...
> 
> > I guess I can do that ...
> > 
> 
> 
> Oh by the way, this scheme doesn't solve all problems. It might be effective
> in reacting "instantly" to a request by the user to *switch off* autosleep.
> But say, when the user wants to switch to suspend instead of hibernate as the
> autosleep preference, for example, I don't think it would be as quick in
> responding... (I mean, it might do the old operation one more time before
> switching to the new one..)
> 
> But I guess at this point it might be wiser to say "sigh.. we can do only so
> much..." instead of complicating the code too much in an attempt to meet
> everybody's expectations :-)

I think we can do something like in the updated patch [5/7] below.

It uses a special wakeup source object called "autosleep" to bump up the
number of wakeup events in progress before acquiring autosleep_lock in
pm_autosleep_set_state().  This way, either pm_autosleep_set_state() will
acquire autosleep_lock before try_to_suspend(), in which case the latter
will see the change of autosleep_state immediately (after autosleep_lock has
been passed to it), or try_to_suspend() will get it first, but then
pm_save_wakeup_count() or pm_suspend()/hibernate() will see the nonzero counter
of wakeup events in progress and return error code (sooner or later).

The drawback is that writes to /sys/power/autosleep may interfere with
the /sys/power/wakeup_count + /sys/power/state interface by interrupting
transitions started by writing to /sys/power/state, for example (although
I think that's highly unlikely).

Additionally, I made pm_autosleep_lock() use mutex_trylock_interruptible()
to prevent operations on /sys/power/wakeup_count and/or /sys/power/state
from failing the freezing of tasks started by try_to_suspend().

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@...k.pl>
Subject: PM / Sleep: Implement opportunistic sleep

Introduce a mechanism by which the kernel can trigger global
transitions to a sleep state chosen by user space if there are no
active wakeup sources.

It consists of a new sysfs attribute, /sys/power/autosleep, that
can be written one of the strings returned by reads from
/sys/power/state, an ordered workqueue and a work item carrying out
the "suspend" operations.  If a string representing the system's
sleep state is written to /sys/power/autosleep, the work item
triggering transitions to that state is queued up and it requeues
itself after every execution until user space writes "off" to
/sys/power/autosleep.

That work item enables the detection of wakeup events using the
functions already defined in drivers/base/power/wakeup.c (with one
small modification) and calls either pm_suspend(), or hibernate() to
put the system into a sleep state.  If a wakeup event is reported
while the transition is in progress, it will abort the transition and
the "system suspend" work item will be queued up again.

Signed-off-by: Rafael J. Wysocki <rjw@...k.pl>
---
 Documentation/ABI/testing/sysfs-power |   17 ++++
 drivers/base/power/wakeup.c           |   38 ++++++-----
 include/linux/suspend.h               |   13 +++
 kernel/power/Kconfig                  |    8 ++
 kernel/power/Makefile                 |    1 
 kernel/power/autosleep.c              |  113 ++++++++++++++++++++++++++++++++
 kernel/power/main.c                   |  117 ++++++++++++++++++++++++++++------
 kernel/power/power.h                  |   18 +++++
 8 files changed, 290 insertions(+), 35 deletions(-)

Index: linux/kernel/power/Makefile
===================================================================
--- linux.orig/kernel/power/Makefile
+++ linux/kernel/power/Makefile
@@ -9,5 +9,6 @@ obj-$(CONFIG_SUSPEND)		+= suspend.o
 obj-$(CONFIG_PM_TEST_SUSPEND)	+= suspend_test.o
 obj-$(CONFIG_HIBERNATION)	+= hibernate.o snapshot.o swap.o user.o \
 				   block_io.o
+obj-$(CONFIG_PM_AUTOSLEEP)	+= autosleep.o
 
 obj-$(CONFIG_MAGIC_SYSRQ)	+= poweroff.o
Index: linux/kernel/power/Kconfig
===================================================================
--- linux.orig/kernel/power/Kconfig
+++ linux/kernel/power/Kconfig
@@ -103,6 +103,14 @@ config PM_SLEEP_SMP
 	select HOTPLUG
 	select HOTPLUG_CPU
 
+config PM_AUTOSLEEP
+	bool "Opportunistic sleep"
+	depends on PM_SLEEP
+	default n
+	---help---
+	Allow the kernel to trigger a system transition into a global sleep
+	state automatically whenever there are no active wakeup sources.
+
 config PM_RUNTIME
 	bool "Run-time PM core functionality"
 	depends on !IA64_HP_SIM
Index: linux/kernel/power/power.h
===================================================================
--- linux.orig/kernel/power/power.h
+++ linux/kernel/power/power.h
@@ -264,3 +264,21 @@ static inline void suspend_thaw_processe
 {
 }
 #endif
+
+#ifdef CONFIG_PM_AUTOSLEEP
+
+/* kernel/power/autosleep.c */
+extern int pm_autosleep_init(void);
+extern int pm_autosleep_lock(void);
+extern void pm_autosleep_unlock(void);
+extern suspend_state_t pm_autosleep_state(void);
+extern int pm_autosleep_set_state(suspend_state_t state);
+
+#else /* !CONFIG_PM_AUTOSLEEP */
+
+static inline int pm_autosleep_init(void) { return 0; }
+static inline int pm_autosleep_lock(void) { return 0; }
+static inline void pm_autosleep_unlock(void) {}
+static inline suspend_state_t pm_autosleep_state(void) { return PM_SUSPEND_ON; }
+
+#endif /* !CONFIG_PM_AUTOSLEEP */
Index: linux/include/linux/suspend.h
===================================================================
--- linux.orig/include/linux/suspend.h
+++ linux/include/linux/suspend.h
@@ -356,7 +356,7 @@ extern int unregister_pm_notifier(struct
 extern bool events_check_enabled;
 
 extern bool pm_wakeup_pending(void);
-extern bool pm_get_wakeup_count(unsigned int *count);
+extern bool pm_get_wakeup_count(unsigned int *count, bool block);
 extern bool pm_save_wakeup_count(unsigned int count);
 
 static inline void lock_system_sleep(void)
@@ -407,6 +407,17 @@ static inline void unlock_system_sleep(v
 
 #endif /* !CONFIG_PM_SLEEP */
 
+#ifdef CONFIG_PM_AUTOSLEEP
+
+/* kernel/power/autosleep.c */
+void queue_up_suspend_work(void);
+
+#else /* !CONFIG_PM_AUTOSLEEP */
+
+static inline void queue_up_suspend_work(void) {}
+
+#endif /* !CONFIG_PM_AUTOSLEEP */
+
 #ifdef CONFIG_ARCH_SAVE_PAGE_KEYS
 /*
  * The ARCH_SAVE_PAGE_KEYS functions can be used by an architecture
Index: linux/kernel/power/autosleep.c
===================================================================
--- /dev/null
+++ linux/kernel/power/autosleep.c
@@ -0,0 +1,113 @@
+/*
+ * kernel/power/autosleep.c
+ *
+ * Opportunistic sleep support.
+ *
+ * Copyright (C) 2012 Rafael J. Wysocki <rjw@...k.pl>
+ */
+
+#include <linux/device.h>
+#include <linux/mutex.h>
+#include <linux/pm_wakeup.h>
+
+#include "power.h"
+
+static suspend_state_t autosleep_state;
+static struct workqueue_struct *autosleep_wq;
+static DEFINE_MUTEX(autosleep_lock);
+static struct wakeup_source *autosleep_ws;
+
+static void try_to_suspend(struct work_struct *work)
+{
+	unsigned int initial_count, final_count;
+
+	if (!pm_get_wakeup_count(&initial_count, true))
+		goto out;
+
+	mutex_lock(&autosleep_lock);
+
+	if (!pm_save_wakeup_count(initial_count)) {
+		mutex_unlock(&autosleep_lock);
+		goto out;
+	}
+
+	if (autosleep_state == PM_SUSPEND_ON) {
+		mutex_unlock(&autosleep_lock);
+		return;
+	}
+	if (autosleep_state >= PM_SUSPEND_MAX)
+		hibernate();
+	else
+		pm_suspend(autosleep_state);
+
+	mutex_unlock(&autosleep_lock);
+
+	if (!pm_get_wakeup_count(&final_count, false))
+		goto out;
+
+	if (final_count == initial_count)
+		schedule_timeout(HZ / 2);
+
+ out:
+	queue_up_suspend_work();
+}
+
+static DECLARE_WORK(suspend_work, try_to_suspend);
+
+void queue_up_suspend_work(void)
+{
+	if (!work_pending(&suspend_work) && autosleep_state > PM_SUSPEND_ON)
+		queue_work(autosleep_wq, &suspend_work);
+}
+
+suspend_state_t pm_autosleep_state(void)
+{
+	return autosleep_state;
+}
+
+int pm_autosleep_lock(void)
+{
+	return mutex_lock_interruptible(&autosleep_lock);
+}
+
+void pm_autosleep_unlock(void)
+{
+	mutex_unlock(&autosleep_lock);
+}
+
+int pm_autosleep_set_state(suspend_state_t state)
+{
+
+#ifndef CONFIG_HIBERNATION
+	if (state >= PM_SUSPEND_MAX)
+		return -EINVAL;
+#endif
+
+	__pm_stay_awake(autosleep_ws);
+
+	mutex_lock(&autosleep_lock);
+
+	autosleep_state = state;
+
+	__pm_relax(autosleep_ws);
+
+	if (state > PM_SUSPEND_ON)
+		queue_up_suspend_work();
+
+	mutex_unlock(&autosleep_lock);
+	return 0;
+}
+
+int __init pm_autosleep_init(void)
+{
+	autosleep_ws = wakeup_source_register("autosleep");
+	if (!autosleep_ws)
+		return -ENOMEM;
+
+	autosleep_wq = alloc_ordered_workqueue("autosleep", 0);
+	if (autosleep_wq)
+		return 0;
+
+	wakeup_source_unregister(autosleep_ws);
+	return -ENOMEM;
+}
Index: linux/kernel/power/main.c
===================================================================
--- linux.orig/kernel/power/main.c
+++ linux/kernel/power/main.c
@@ -269,8 +269,7 @@ static ssize_t state_show(struct kobject
 	return (s - buf);
 }
 
-static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
-			   const char *buf, size_t n)
+static suspend_state_t decode_state(const char *buf, size_t n)
 {
 #ifdef CONFIG_SUSPEND
 	suspend_state_t state = PM_SUSPEND_STANDBY;
@@ -278,27 +277,48 @@ static ssize_t state_store(struct kobjec
 #endif
 	char *p;
 	int len;
-	int error = -EINVAL;
 
 	p = memchr(buf, '\n', n);
 	len = p ? p - buf : n;
 
-	/* First, check if we are requested to hibernate */
-	if (len == 4 && !strncmp(buf, "disk", len)) {
-		error = hibernate();
-		goto Exit;
-	}
+	/* Check hibernation first. */
+	if (len == 4 && !strncmp(buf, "disk", len))
+		return PM_SUSPEND_MAX;
 
 #ifdef CONFIG_SUSPEND
-	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) {
-		if (*s && len == strlen(*s) && !strncmp(buf, *s, len)) {
-			error = pm_suspend(state);
-			break;
-		}
-	}
+	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++)
+		if (*s && len == strlen(*s) && !strncmp(buf, *s, len))
+			return state;
 #endif
 
- Exit:
+	return PM_SUSPEND_ON;
+}
+
+static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
+			   const char *buf, size_t n)
+{
+	suspend_state_t state;
+	int error;
+
+	error = pm_autosleep_lock();
+	if (error)
+		return error;
+
+	if (pm_autosleep_state() > PM_SUSPEND_ON) {
+		error = -EBUSY;
+		goto out;
+	}
+
+	state = decode_state(buf, n);
+	if (state < PM_SUSPEND_MAX)
+		error = pm_suspend(state);
+	else if (state > PM_SUSPEND_ON)
+		error = hibernate();
+	else
+		error = -EINVAL;
+
+ out:
+	pm_autosleep_unlock();
 	return error ? error : n;
 }
 
@@ -339,7 +359,8 @@ static ssize_t wakeup_count_show(struct
 {
 	unsigned int val;
 
-	return pm_get_wakeup_count(&val) ? sprintf(buf, "%u\n", val) : -EINTR;
+	return pm_get_wakeup_count(&val, true) ?
+		sprintf(buf, "%u\n", val) : -EINTR;
 }
 
 static ssize_t wakeup_count_store(struct kobject *kobj,
@@ -347,15 +368,69 @@ static ssize_t wakeup_count_store(struct
 				const char *buf, size_t n)
 {
 	unsigned int val;
+	int error;
+
+	error = pm_autosleep_lock();
+	if (error)
+		return error;
+
+	if (pm_autosleep_state() > PM_SUSPEND_ON) {
+		error = -EBUSY;
+		goto out;
+	}
 
 	if (sscanf(buf, "%u", &val) == 1) {
 		if (pm_save_wakeup_count(val))
 			return n;
 	}
-	return -EINVAL;
+	error = -EINVAL;
+
+ out:
+	pm_autosleep_unlock();
+	return error;
 }
 
 power_attr(wakeup_count);
+
+#ifdef CONFIG_PM_AUTOSLEEP
+static ssize_t autosleep_show(struct kobject *kobj,
+			      struct kobj_attribute *attr,
+			      char *buf)
+{
+	suspend_state_t state = pm_autosleep_state();
+
+	if (state == PM_SUSPEND_ON)
+		return sprintf(buf, "off\n");
+
+#ifdef CONFIG_SUSPEND
+	if (state < PM_SUSPEND_MAX)
+		return sprintf(buf, "%s\n", valid_state(state) ?
+						pm_states[state] : "error");
+#endif
+#ifdef CONFIG_HIBERNATION
+	return sprintf(buf, "disk\n");
+#else
+	return sprintf(buf, "error");
+#endif
+}
+
+static ssize_t autosleep_store(struct kobject *kobj,
+			       struct kobj_attribute *attr,
+			       const char *buf, size_t n)
+{
+	suspend_state_t state = decode_state(buf, n);
+	int error;
+
+	if (state == PM_SUSPEND_ON && strncmp(buf, "off", 3)
+	    && strncmp(buf, "off\n", 4))
+		return -EINVAL;
+
+	error = pm_autosleep_set_state(state);
+	return error ? error : n;
+}
+
+power_attr(autosleep);
+#endif /* CONFIG_PM_AUTOSLEEP */
 #endif /* CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM_TRACE
@@ -409,6 +484,9 @@ static struct attribute * g[] = {
 #ifdef CONFIG_PM_SLEEP
 	&pm_async_attr.attr,
 	&wakeup_count_attr.attr,
+#ifdef CONFIG_PM_AUTOSLEEP
+	&autosleep_attr.attr,
+#endif
 #ifdef CONFIG_PM_DEBUG
 	&pm_test_attr.attr,
 #endif
@@ -444,7 +522,10 @@ static int __init pm_init(void)
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
-	return sysfs_create_group(power_kobj, &attr_group);
+	error = sysfs_create_group(power_kobj, &attr_group);
+	if (error)
+		return error;
+	return pm_autosleep_init();
 }
 
 core_initcall(pm_init);
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -492,8 +492,10 @@ static void wakeup_source_deactivate(str
 	atomic_add(MAX_IN_PROGRESS, &combined_event_count);
 
 	split_counters(&cnt, &inpr);
-	if (!inpr && waitqueue_active(&wakeup_count_wait_queue))
+	if (!inpr && waitqueue_active(&wakeup_count_wait_queue)) {
 		wake_up(&wakeup_count_wait_queue);
+		queue_up_suspend_work();
+	}
 }
 
 /**
@@ -654,29 +656,33 @@ bool pm_wakeup_pending(void)
 /**
  * pm_get_wakeup_count - Read the number of registered wakeup events.
  * @count: Address to store the value at.
+ * @block: Whether or not to block.
  *
- * Store the number of registered wakeup events at the address in @count.  Block
- * if the current number of wakeup events being processed is nonzero.
+ * Store the number of registered wakeup events at the address in @count.  If
+ * @block is set, block until the current number of wakeup events being
+ * processed is zero.
  *
- * Return 'false' if the wait for the number of wakeup events being processed to
- * drop down to zero has been interrupted by a signal (and the current number
- * of wakeup events being processed is still nonzero).  Otherwise return 'true'.
+ * Return 'false' if the current number of wakeup events being processed is
+ * nonzero.  Otherwise return 'true'.
  */
-bool pm_get_wakeup_count(unsigned int *count)
+bool pm_get_wakeup_count(unsigned int *count, bool block)
 {
 	unsigned int cnt, inpr;
-	DEFINE_WAIT(wait);
 
-	for (;;) {
-		prepare_to_wait(&wakeup_count_wait_queue, &wait,
-				TASK_INTERRUPTIBLE);
-		split_counters(&cnt, &inpr);
-		if (inpr == 0 || signal_pending(current))
-			break;
+	if (block) {
+		DEFINE_WAIT(wait);
 
-		schedule();
+		for (;;) {
+			prepare_to_wait(&wakeup_count_wait_queue, &wait,
+					TASK_INTERRUPTIBLE);
+			split_counters(&cnt, &inpr);
+			if (inpr == 0 || signal_pending(current))
+				break;
+
+			schedule();
+		}
+		finish_wait(&wakeup_count_wait_queue, &wait);
 	}
-	finish_wait(&wakeup_count_wait_queue, &wait);
 
 	split_counters(&cnt, &inpr);
 	*count = cnt;
Index: linux/Documentation/ABI/testing/sysfs-power
===================================================================
--- linux.orig/Documentation/ABI/testing/sysfs-power
+++ linux/Documentation/ABI/testing/sysfs-power
@@ -172,3 +172,20 @@ Description:
 
 		Reading from this file will display the current value, which is
 		set to 1 MB by default.
+
+What:		/sys/power/autosleep
+Date:		February 2012
+Contact:	Rafael J. Wysocki <rjw@...k.pl>
+Description:
+		The /sys/power/autosleep file can be written one of the strings
+		returned by reads from /sys/power/state.  If that happens, a
+		work item attempting to trigger a transition of the system to
+		the sleep state represented by that string is queued up.  This
+		attempt will only succeed if there are no active wakeup sources
+		in the system at that time.  After evey execution, regardless
+		of whether or not the attempt to put the system to sleep has
+		succeeded, the work item requeues itself until user space
+		writes "off" to /sys/power/autosleep.
+
+		Reading from this file causes the last string successfully
+		written to it to be displayed.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ