Message-ID: <13348109.c4H00groOp@vostro.rjw.lan>
Date: Fri, 25 Apr 2014 14:19:46 +0200
From: "Rafael J. Wysocki" <rjw@...ysocki.net>
To: Morten Rasmussen <morten.rasmussen@....com>
Cc: Yuyang Du <yuyang.du@...el.com>,
"mingo@...hat.com" <mingo@...hat.com>,
"peterz@...radead.org" <peterz@...radead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
"arjan.van.de.ven@...el.com" <arjan.van.de.ven@...el.com>,
"len.brown@...el.com" <len.brown@...el.com>,
"rafael.j.wysocki@...el.com" <rafael.j.wysocki@...el.com>,
"alan.cox@...el.com" <alan.cox@...el.com>,
"mark.gross@...el.com" <mark.gross@...el.com>,
"vincent.guittot@...aro.org" <vincent.guittot@...aro.org>
Subject: Re: [RFC] A new CPU load metric for power-efficient scheduler: CPU ConCurrency
On Friday, April 25, 2014 11:23:07 AM Morten Rasmussen wrote:
> Hi Yuyang,
>
> On Thu, Apr 24, 2014 at 08:30:05PM +0100, Yuyang Du wrote:
> > 1) Divide continuous time into periods, and average the task concurrency
> > within each period, to tolerate transient bursts:
> > a = sum(concurrency * time) / period
> > 2) Exponentially decay the past periods and combine them all, to provide
> > hysteresis against load drops or resilience to load rises (let f be the
> > decay factor, and a_x the xth period average since period 0):
> > s = a_n + f^1 * a_(n-1) + f^2 * a_(n-2) + ... + f^(n-1) * a_1 + f^n * a_0
> >
> > We name this load indicator as CPU ConCurrency (CC): task concurrency
> > determines how many CPUs are needed to be running concurrently.
> >
> > To track CC, we intercept the scheduler in 1) enqueue, 2) dequeue, 3)
> > scheduler tick, and 4) enter/exit idle.
> >
> > Based on CC, we implemented a Workload Consolidation patch on two Intel mobile
> > platforms (a quad-core composed of two dual-core modules): load and load
> > balancing are contained in the first dual-core when the aggregated CC is low,
> > and spread across the full quad-core otherwise. Results show power savings and
> > no substantial performance regression (and even gains for some workloads).
>
> The idea you present seems quite similar to the task packing proposals
> by Vincent and others that were discussed about a year ago. One of the
> main issues related to task packing/consolidation is that it is not
> always beneficial.
>
> I have spent some time over the last couple of weeks looking into this
> trying to figure out when task consolidation makes sense. The pattern I
> have seen is that it makes most sense when the task energy is dominated
> by wake-up costs, that is, short-running tasks. The actual energy savings
> come from a reduced number of wake-ups if the consolidation cpu is busy
> enough to be already awake when another task wakes up, and savings by
> keeping the consolidation cpu in a shallower idle state and thereby
> reducing the wake-up costs. The wake-up cost savings outweigh the
> additional leakage in the shallower idle state in some scenarios. All of
> this is of course quite platform dependent. Different idle state leakage
> power and wake-up costs may change the picture.
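
Just so we are looking at the same thing, a minimal user-space sketch of the
CC bookkeeping quoted above might look like the following; the period length,
the decay factor and all names are illustrative assumptions, not the actual
patch:

/*
 * Illustrative sketch: per-CPU "ConCurrency" bookkeeping.  Within each
 * period we accumulate sum(concurrency * time); at each period boundary
 * we fold the period average into the decayed sum
 * s = a_n + f * a_(n-1) + f^2 * a_(n-2) + ...
 * For simplicity, a contribution that straddles a period boundary is
 * credited entirely to the period that just ended.
 */
#include <stdio.h>

#define PERIOD_NS  (10 * 1000 * 1000ULL)  /* assumed 10 ms period  */
#define DECAY_NUM  7                      /* assumed decay f = 7/8 */
#define DECAY_DEN  8

struct cc {
	unsigned long long contrib;  /* sum(concurrency * time) this period */
	unsigned long long elapsed;  /* time accounted in this period       */
	unsigned long long decayed;  /* s, the decayed sum of past averages */
};

/* Account 'delta' ns during which 'nr_running' tasks were runnable. */
static void cc_update(struct cc *cc, unsigned int nr_running,
		      unsigned long long delta)
{
	cc->contrib += (unsigned long long)nr_running * delta;
	cc->elapsed += delta;

	while (cc->elapsed >= PERIOD_NS) {
		/* a = sum(concurrency * time) / period */
		unsigned long long avg = cc->contrib / PERIOD_NS;

		/* s = a + f * s_prev */
		cc->decayed = avg + cc->decayed * DECAY_NUM / DECAY_DEN;
		cc->contrib = 0;
		cc->elapsed -= PERIOD_NS;
	}
}

int main(void)
{
	struct cc cc = { 0, 0, 0 };

	/* Two tasks runnable for 5 ms, then one task for 5 ms. */
	cc_update(&cc, 2, 5 * 1000 * 1000ULL);
	cc_update(&cc, 1, 5 * 1000 * 1000ULL);
	printf("decayed CC: %llu\n", cc.decayed);
	return 0;
}

The same bookkeeping could be driven from the enqueue/dequeue/tick/idle hooks
mentioned in the quoted text, with nr_running sampled at each event.
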
The problem, however, is that it usually is not known in advance whether a
given task will be short-running. There is simply no way to tell.
The only kinds of information we can possibly use to base decisions on are
(1) things that don't change (or if they change, we know exactly when and
how), such as the system's topology, and (2) information on what happened
in the past. So, for example, if there's a task that has been running for
some time already and it has behaved in approximately the same way all the
time, it is reasonable to assume that it will behave in this way in the
future. We need to let it run for a while to collect that information,
though.
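
To make that concrete, one way to collect such information would be a per-task
decayed average of observed run lengths, along these lines; the structure, the
averaging weight and the threshold are illustrative assumptions, not an
existing kernel interface:

/*
 * Illustrative sketch: estimate whether a task tends to be short-running
 * from its observed behaviour.  The geometric weighting lets old samples
 * fade, so the estimate adapts if the task changes behaviour.
 */
#include <stdbool.h>
#include <stdio.h>

#define RUN_AVG_SHIFT        3                /* new sample weight 1/8        */
#define SHORT_RUN_THRESH_NS  (500 * 1000ULL)  /* assumed: < 0.5 ms is "short" */

struct task_stats {
	unsigned long long avg_run_ns;  /* decayed average run length */
	unsigned long long nr_samples;  /* how much history we have   */
};

/* Called when the task is switched out after having run for 'ran_ns'. */
static void task_stats_update(struct task_stats *ts, unsigned long long ran_ns)
{
	if (!ts->nr_samples)
		ts->avg_run_ns = ran_ns;  /* seed with the first sample */
	else
		ts->avg_run_ns = ts->avg_run_ns - (ts->avg_run_ns >> RUN_AVG_SHIFT)
				 + (ran_ns >> RUN_AVG_SHIFT);
	ts->nr_samples++;
}

/* Only trust the estimate once enough history has been collected. */
static bool task_looks_short_running(const struct task_stats *ts)
{
	return ts->nr_samples >= 8 && ts->avg_run_ns < SHORT_RUN_THRESH_NS;
}

int main(void)
{
	struct task_stats ts = { 0, 0 };

	for (int i = 0; i < 10; i++)
		task_stats_update(&ts, 200 * 1000ULL);  /* ten 0.2 ms runs */
	printf("short-running? %d\n", task_looks_short_running(&ts));
	return 0;
}
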
Without that kind of information we can only speculate about what's going
to happen and different methods of speculation may lead to better or worse
results in a given situation, but still that's only speculation and the
results are only known after the fact.
Conversely, if I know the system topology and have a particular workload,
I know what's going to happen, so I can find a load balancing method that will
be perfect for this particular workload on this particular system. That's not
the situation the scheduler has to deal with, though, because the workload is
unknown to it until it has been measured.
So in my opinion we need to figure out how to measure workloads while they are
running and then use that information to make load balancing decisions.
In principle, given the system's topology, task packing may lead to better
results for some workloads, but not necessarily for all of them. So we need
a way to determine (a) whether or not task packing is an option at all in the
given system (that may change over time due to user policy changes etc.) and
if it is, then (b) whether the current workload is eligible for task packing.
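
A rough sketch of how (a) and (b) might combine, assuming a measured
concurrency figure such as the one discussed above is available; the
structure, the fixed-point scale and the headroom are illustrative
assumptions:

/*
 * Illustrative sketch of the two-step decision: (a) is packing an option
 * at all on this system and under the current user policy, and (b) does
 * the measured workload fit on the consolidation CPUs.
 */
#include <stdbool.h>
#include <stdio.h>

struct packing_domain {
	bool hw_supports_packing;         /* (a) the topology allows it */
	bool user_policy_allows;          /* (a) may change at runtime  */
	unsigned int consolidation_cpus;  /* e.g. one dual-core module  */
};

/* (b): measured concurrency, scaled so that 1024 == one fully busy CPU. */
static bool workload_fits(unsigned int measured_cc, unsigned int nr_cpus)
{
	/* Leave 20% headroom so transient bursts do not regress performance. */
	return measured_cc < nr_cpus * 1024 * 8 / 10;
}

static bool should_pack(const struct packing_domain *pd, unsigned int measured_cc)
{
	if (!pd->hw_supports_packing || !pd->user_policy_allows)
		return false;  /* (a): packing is not an option */

	return workload_fits(measured_cc, pd->consolidation_cpus);  /* (b) */
}

int main(void)
{
	struct packing_domain pd = { true, true, 2 };

	printf("pack? %d\n", should_pack(&pd, 1500));  /* ~1.5 busy CPUs */
	return 0;
}

The policy part of (a) would presumably be re-evaluated whenever the user
changes it, while (b) has to follow the measured workload as it changes.
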
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.