Message-ID: <4D0E9F20.6080606@sssup.it>
Date: Mon, 20 Dec 2010 01:11:12 +0100
From: Tommaso Cucinotta <tommaso.cucinotta@...up.it>
To: Harald Gustafsson <hgu1972@...il.com>
CC: Peter Zijlstra <peterz@...radead.org>,
Dario Faggioli <raistlin@...ux.it>,
Harald Gustafsson <harald.gustafsson@...csson.com>,
linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
Thomas Gleixner <tglx@...utronix.de>,
Claudio Scordino <claudio@...dence.eu.com>,
Michael Trimarchi <trimarchi@...is.sssup.it>,
Fabio Checconi <fabio@...dalf.sssup.it>,
Juri Lelli <juri.lelli@...il.com>
Subject: Re: [PATCH 1/3] Added runqueue clock normalized with cpufreq
On 17/12/2010 20:31, Harald Gustafsson wrote:
>>> We already did the very same thing (for another EU Project called
>>> FRESCOR), although it was done in an userspace sort of daemon. It was
>>> also able to consider other "high level" parameters like some estimation
>>> of the QoS of each application and of the global QoS of the system.
>>>
>>> However, converting the basic mechanism into a CPUfreq governor should
>>> be easily doable... The only problem is finding the time for that! ;-P
>> Ah, I think Harald will solve that for you,.. :)
> Yes, I don't mind doing that. Could you point me to the right part of
> the FRESCOR code, Dario?
Hi there,
I'm sorry to join this discussion so late, but the unprecedented 20cm of
snow in Pisa had some non-negligible effects on my return flight from
Perth :-).
Let me try to briefly recap the outcomes of FRESCOR w.r.t.
power management (though usually I'm not that brief :-) ):
1. from the requirements analysis phase, it came out that it should be
possible to specify an individual runtime for each possible frequency,
as it is well known that the way computation times scale with CPU
frequency is application-dependent (and platform-dependent); the idea is
that, as a developer, I can specify the possible configurations of
my real-time app, and the OS is then free to pick the CPU frequency
that best suits its power management logic (e.g., the minimum
frequency at which all my deadlines can still be met); a small sketch
of such an interface is further below.
Requirements Analysis:
http://www.frescor.org/index.php?mact=Uploads,cntnt01,getfile,0&cntnt01showtemplate=false&cntnt01upload_id=62&cntnt01returnid=54
Proposed API:
http://www.frescor.org/index.php?mact=Uploads,cntnt01,getfile,0&cntnt01showtemplate=false&cntnt01upload_id=105&cntnt01returnid=54
I also attach the API we implemented; consider, however, that it is a mix
of calls for doing what I wrote above and for building an OS-independent
abstraction layer for dealing with CPU frequency scaling (and not only
that) on the heterogeneous OSes we had in FRESCOR;
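Just to make 1. a bit more concrete, a per-frequency runtime
specification could look roughly like the sketch below (purely
illustrative names and layout, this is *not* the FRSH API from the
attached header):

#include <stdint.h>

/* illustrative only: one runtime per supported CPU frequency */
struct rt_freq_runtime {
        uint32_t freq_khz;      /* CPU frequency the runtime refers to */
        uint64_t runtime_ns;    /* budget needed per period at that frequency */
};

struct rt_params {
        uint64_t period_ns;
        uint64_t deadline_ns;
        int      nr_freqs;      /* 1 => a single runtime was given (see 4. below) */
        struct rt_freq_runtime runtimes[8];  /* at most one entry per P-state */
};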
2. this also assumed, at the API level, quite static settings
(typical of hard RT), in which I configure the system and don't change
its frequency too often; for example, the implications of frequency
switches on hard real-time requirements (i.e., time windows in which the
CPU is not operating during the switch, and limits on the maximum
switching rate sustainable by apps and the like) are not stated through
the API;
3. for soft real-time contexts and Linux (consider that FRESCOR targeted
both hard RT on RT OSes and soft RT on Linux), we played with a much
simpler, trivial linear scaling, which is exactly what has been proposed
and implemented by someone in this thread on top of SCHED_DEADLINE (AFAIU);
however, there is one trick which cannot be neglected, i.e., the *change
protocol* (see 5.); benchmarks on MPEG-2 decoding times showed that the
linear approximation is not that bad, but the best interpolating ratios
between the computing times at different CPU frequencies do not
perfectly match the frequency ratios; we haven't attempted any
extensive evaluation over different workloads so far. See Figure 4.1
in D-AQ2v2:
http://www.frescor.org/index.php?mact=Uploads,cntnt01,getfile,0&cntnt01showtemplate=false&cntnt01upload_id=82&cntnt01returnid=54
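For reference, the trivial linear scaling of 3. amounts to
runtime(f) = runtime(f_ref) * f_ref / f; a minimal sketch (overflow and
rounding ignored):

#include <stdint.h>

/* trivially rescale a runtime specified at the reference frequency f_ref
 * to the current frequency f_cur: at a lower frequency the same work
 * needs proportionally more time */
static uint64_t rescale_runtime(uint64_t runtime_ref_ns,
                                uint32_t f_ref_khz, uint32_t f_cur_khz)
{
        return runtime_ref_ns * f_ref_khz / f_cur_khz;
}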
4. I would say that, given the tendency to over-provision the runtime
(WCET) in hard real-time contexts, it would not be too much of a
burden for a hard RT developer to properly over-provision the required
budget in the presence of a trivial runtime rescaling policy like the one
in 3.; however, in order to make everybody happy, it doesn't seem a bad
idea to have something like:
4a) use the fine-grained runtimes specified by the user, if they are
available;
4b) use the trivially rescaled runtime if the user only specified a
single one; of course, in that case it should be clear through the API
which frequency the user's runtime refers to (e.g., the maximum one?);
something along the lines of the sketch below.
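Putting 4a) and 4b) together, and reusing the illustrative types and
helper from the sketches above (again, not a real API), the selection
could be as simple as:

/* use the per-frequency runtime if one was provided for this frequency,
 * otherwise fall back to trivially rescaling the single reference entry
 * (assumed to be specified at the maximum frequency) */
static uint64_t runtime_for_freq(const struct rt_params *p, uint32_t f_khz)
{
        int i;

        /* 4a) the developer gave a fine-grained table of runtimes */
        for (i = 0; i < p->nr_freqs; i++)
                if (p->runtimes[i].freq_khz == f_khz)
                        return p->runtimes[i].runtime_ns;

        /* 4b) no exact entry: rescale from the reference one */
        return rescale_runtime(p->runtimes[0].runtime_ns,
                               p->runtimes[0].freq_khz, f_khz);
}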
5. Mode Change Protocol: whenever a frequency switch occurs (e.g.,
dictated by fluctuations of the non-RT workload), runtimes cannot simply
be rescaled instantaneously. Keeping it short, the simplest thing we can
do is to rely on the various CBS servers implemented in the scheduler to
apply the change from the next "runtime recharge", i.e., the next
period. This creates the potential problem that RT tasks see a
non-negligible transient in the instances crossing the CPU frequency
switch, during which they do not have enough runtime for their work. Now,
the general "rule of thumb" is straightforward: make room first, then
"pack"; i.e., we need to consider 2 distinct cases (a rough sketch in
pseudo-C follows after 5b):
5a) we want to *increase the CPU frequency*: we can immediately
increase the frequency; the RT applications will then have a temporary
over-provisioning of runtime (still tuned for the slower frequency),
and as soon as we're sure the CPU frequency switch has completed,
we can lower the runtimes to the new values;
5b) we want to *decrease the CPU frequency*: unfortunately, here we
need to proceed the other way round: first we increase the
runtimes of the RT applications to the new values, then, as soon as
we're sure all the scheduling servers have applied the change (waiting at
most for a time equal to the maximum configured RT period), we can
actually perform the frequency switch. Of course, there is an assumption
before switching the frequency: the new runtimes after the frequency
decrease must still be schedulable, so the CPU frequency switching logic
needs to be aware of the allocated RT reservations.
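In pseudo-C, the ordering in 5a)/5b) would be something like the sketch
below; every function called here is a placeholder for whatever mechanism
is actually available, none of them is a real cpufreq or SCHED_DEADLINE
interface:

/* placeholder hooks */
void cpufreq_set(uint32_t f_khz);
void wait_freq_switch_completed(void);
void set_rt_runtimes_for(uint32_t f_khz);
void wait_max_rt_period(void);
int  rt_schedulable_at(uint32_t f_khz);

/* "make room first, then pack" */
static int switch_frequency(uint32_t f_old_khz, uint32_t f_new_khz)
{
        if (f_new_khz > f_old_khz) {
                /* 5a) speeding up: raise the frequency first, then shrink
                 * the runtimes once the switch has really completed */
                cpufreq_set(f_new_khz);
                wait_freq_switch_completed();
                set_rt_runtimes_for(f_new_khz);
        } else {
                /* 5b) slowing down: check schedulability at the new
                 * frequency, enlarge the runtimes, wait until every CBS
                 * server has gone through a recharge (at most the maximum
                 * configured RT period), and only then lower the frequency */
                if (!rt_schedulable_at(f_new_khz))
                        return -1;
                set_rt_runtimes_for(f_new_khz);
                wait_max_rt_period();
                cpufreq_set(f_new_khz);
        }
        return 0;
}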
The protocol in 5. has been implemented completely in user space as a
modification of the powernowd daemon, in the context of an extended
version of a paper in which we were automagically guessing the whole set
of scheduling parameters for periodic RT applications (EuroSys 2010).
The modified powernowd considered both the overall RT utilization
imposed by the RT reservations and the non-RT utilization as measured
on the CPU. The paper will appear in ACM TECS, but who knows when, so
here you can find it (see Section 7.5, "Power Management"):
http://retis.sssup.it/~tommaso/publications/ACM-TECS-2010.pdf
(last remark: no attempt to deal with multi-cores and their various
power switching capabilities in this paper . . .)
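The frequency selection was roughly of the kind sketched below (an
illustration of the idea using the same illustrative types as above, not
the actual powernowd code): pick the lowest frequency at which the
rescaled RT utilization, known from the admitted reservations, plus the
measured non-RT utilization still fits under a threshold.

/* freqs_khz[] is assumed sorted in ascending order; utilizations are
 * expressed relative to the maximum frequency */
static uint32_t pick_frequency(const uint32_t *freqs_khz, int nr_freqs,
                               double u_rt_at_fmax, double u_nonrt_at_fmax,
                               double threshold)
{
        uint32_t f_max = freqs_khz[nr_freqs - 1];
        int i;

        for (i = 0; i < nr_freqs; i++) {
                /* under linear scaling, utilizations grow as f_max / f */
                double scale = (double)f_max / freqs_khz[i];
                if ((u_rt_at_fmax + u_nonrt_at_fmax) * scale <= threshold)
                        return freqs_khz[i];
        }
        return f_max;   /* nothing fits: run at full speed */
}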
Last, but not least, the whole point of the above discussion rests on the
assumption that it is meaningful to have a CPU frequency switching
policy at all, as opposed to merely relying on CPU idling. Perhaps on old
embedded CPUs this is still the case. Unfortunately, from preliminary
measurements made on a few systems I use every day, through a cheap power
meter attached to the power cable, I could actually see that for RT-only
workloads it is worth leaving the system at the maximum frequency
and exploiting the much longer time spent in idle mode(s), except when
the system is completely idle.
If you're interested, I can share the collected data sets.
Bye (and apologies for the length).
T.
--
Tommaso Cucinotta, Computer Engineering PhD, Researcher
ReTiS Lab, Scuola Superiore Sant'Anna, Pisa, Italy
Tel +39 050 882 024, Fax +39 050 882 003
http://retis.sssup.it/people/tommaso
View attachment "frsh_energy_management.h" of type "text/x-chdr" (11690 bytes)