linux-kernel - Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

Message-ID: <20100528083934.5f2f2cb6@schatten.dmk.lab>
Date:	Fri, 28 May 2010 08:39:34 +0200
From:	Florian Mickler <florian@...kler.org>
To:	Brian Swetland <swetland@...gle.com>
Cc:	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Matthew Garrett <mjg59@...f.ucam.org>,
	Alan Stern <stern@...land.harvard.edu>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	LKML <linux-kernel@...r.kernel.org>, felipe.balbi@...ia.com,
	Linux OMAP Mailing List <linux-omap@...r.kernel.org>,
	Linux PM <linux-pm@...ts.linux-foundation.org>
Subject: Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

On Thu, 27 May 2010 21:55:26 -0700
Brian Swetland <swetland@...gle.com> wrote:

> On Thu, May 27, 2010 at 3:55 PM, Alan Cox <alan@...rguk.ukuu.org.uk> wrote:
> >
> > This started because the Android people came to a meeting that was put
> > together of various folks to try and sort out the big blockage in getting
> > Android and Linux kernels back towards merging.
> >
> > I am interested right now in finding a general solution to the Android
> > case and the fact it looks very similar to the VM, hard RT, gamer and
> > other related problems although we seem to have diverged from that logic.
> 
> I think that the suspend block model can be viewed as a constraints
> problem (similar to some of the things you've been sketching out in
> these threads), but I think we (Google/Android) view it as more of a
> state constraint (don't enter suspend) than a latency constraint.
> 
> We think there's a need for these constraints both from the driver
> side and userspace side, and that these constraints are not tied to
> processes (multiple entities in one process may have different
> constraints at different times or multiple processes may be working
> together to accomplish some goal under a single constraint -- at least
> both cases exist in the Android system as it ships today).
> 
> The exact naming of the API is not terribly important to us.  The
> first thing we spent a bunch of time discussing last summer when Arve
> first looked into sending wakelocks upstream was changing the name
> because many objected to "wakelock" for various reasons.
> 
> Being able to have useful statistics (which drivers/processes/etc
> held which wakelock for how long, how many times, etc) is important to
> us.  While we want to do the best we can in the face of poorly written
> apps, we also want to educate users and developers about which apps
> are contributing to their poor battery life -- so users can decide to
> uninstall an app if its usefulness does not justify its impact on
> battery life and application developers can be more aware of what the
> cost of their app is to endusers.
> 
> As an example, http://frotz.net/misc/battery-stats-unplugged.txt
> contains a dump from the "battery service" aggregating wakelock usage,
> cpu usage, and sensor device usage of processes (#....: sections) on
> my phone the other day for a ~3 hour period.  This data is presented
> visually to the enduser in a "what's using my battery" feature of the
> platform.  "realtime" refers to wall clock time here and "uptime"
> refers to not-in-suspend execution time.
> 
> Brian


Hi!
Thinking about the issue a little more, this isn't really about trusted
and untrusted apps, or about crapplications.

The point is that as soon as an app takes a suspend-blocker, it becomes
what is referred to here as a "trusted app", but only because it is
then visibly consuming power in an official way.

Android suspends (as in echo mem > /sys/power/state)
whenever possible. It's as if there were a spring on the laptop lid,
and if the user doesn't hold his grip on it, the thing closes. How does
he hold his grip? The application registers a suspend-blocker for him.

So, why not use something like idle/QOS with this? 

In theory, I can imagine having a "latency requirement" where 0 means
this application does not interact with the user, and != 0 means this
application interacts with the user.

("latency requirement" doesn't quite get it, but it works for now)

In android land, the default would be that every application has a
latency-requirement of 0. And then everything (userland) that takes a
suspend-blocker would be changed to take a "latency requirement != 0". 

Now, if the system interacts with the user (i.e. there is a global
latency requirement > 0, where "global latency requirement" is
computed by the pm framework maxing over all the userland processes
and the kernel side), everything has to run. So we also need to
schedule things which specify a latency requirement == 0.
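
To make the "maxing over" idea concrete, here is a minimal sketch in
plain userspace C. It is only an illustration: struct pm_client,
global_latency_req() and may_suspend() are hypothetical names, not an
existing kernel API, and a real implementation would need locking and
a proper registration interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: one entry per userland process or kernel-side client.
 * 0 means "does not interact with the user"; != 0 means it does. */
struct pm_client {
	const char *name;
	unsigned int latency_req;
};

/* The global requirement is the maximum over all registered clients. */
unsigned int global_latency_req(const struct pm_client *clients, size_t n)
{
	unsigned int max = 0;
	size_t i;

	assert(clients != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (clients[i].latency_req > max)
			max = clients[i].latency_req;
	return max;
}

/* Suspend is allowed only while no client holds a nonzero requirement. */
int may_suspend(const struct pm_client *clients, size_t n)
{
	return global_latency_req(clients, n) == 0;
}
```

With this shape, a single client raising its requirement above 0 is
enough to keep the whole system out of suspend, which is the behaviour
a userland suspend-blocker has today.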

This last point means that it has to be independent of the scheduler,
doesn't it?

I don't see how renaming suspend_blocker to set_pidle would not do
something equivalent to this, but the bits are probably a bit scattered
throughout the kernel.
(Which I don't think is introduced by that patch set, but by the fact
that suspend is currently not an idle state.)

I can understand if there needs to be a good solution in the kernel
from day 1. 

So, what would constitute a good solution?

The more experienced people should probably jump in here, but let me
summarize what I've gathered in this discussion (especially from Thomas
and Alan Cox):

1. change suspend framework to be "just another idle state"
2. specify that "just another idle state" can only be entered if
"global latency requirement" == 0
3. probably add some cost-estimate-computation to the "just another
idle state"

(The trick here is that this idle state ignores all current measures of
"idle", so the cost for it would depend only on the cost estimate to
enter it and the power usage while suspended. This also means it is
probably 'opportune' to enter it whenever possible, except when the
machine is already idle the old way, because the cost to enter is
bigger.)

4. change the userspace suspend interface 
	i.e. echo mem > /sys/power/state to override the "global
	latency requirement" to be 0.

5. convert the drivers to relax their latency requirement to 0
whenever possible. (In Android land this is already done; it probably
just needs a s/suspend_block/set_pidle(1)/ )
6. enhance the cpufreq drivers to take the global latency requirement
into account. (I.e. opportunistic suspend would be implemented in the
proper place; I don't know where that is, please chime in.)
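
As a rough illustration of how steps 2 to 4 could fit together, here
is a small sketch in plain userspace C. Everything in it is assumed
for the sake of the example: enum pm_state, struct suspend_cost,
suspend_pays_off() and pick_state() are hypothetical names, and the
break-even test against an expected sleep time is just one possible
form of the "cost-estimate-computation", not something taken from the
patch set.

```c
#include <assert.h>
#include <stddef.h>

enum pm_state { PM_RUN, PM_IDLE, PM_SUSPEND };

/* Hypothetical cost model for the suspend "idle state". */
struct suspend_cost {
	unsigned long enter_exit_uj;	/* energy to enter and leave suspend */
	unsigned long suspend_uw;	/* power drawn while suspended */
	unsigned long idle_uw;		/* power drawn in ordinary idle */
};

/* Step 3: suspend pays off if it uses less energy than staying idle
 * for the expected sleep time (uW * s = uJ). */
int suspend_pays_off(const struct suspend_cost *c, unsigned long sleep_s)
{
	assert(c != NULL);
	return c->enter_exit_uj + c->suspend_uw * sleep_s
		< c->idle_uw * sleep_s;
}

enum pm_state pick_state(unsigned int global_req, int forced, int cpu_idle,
			 const struct suspend_cost *c, unsigned long sleep_s)
{
	/* Step 4: echo mem > /sys/power/state overrides the requirement. */
	if (forced)
		return PM_SUSPEND;
	/* Step 2: suspend may only be entered if the requirement is 0.
	 * Note that cpu_idle is not consulted for this decision. */
	if (global_req == 0 && suspend_pays_off(c, sleep_s))
		return PM_SUSPEND;
	return cpu_idle ? PM_IDLE : PM_RUN;
}
```

The point of the sketch is only that the suspend decision depends on
the global requirement and the cost estimate, while the ordinary idle
decision stays where it is.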

So, what specifically would have to be done to the suspend blockers
patches? And can it be done incrementally? (I guess the answer is no,
we don't want this done in the kernel, we want it done right?)

Cheers,
Flo


