linux-kernel - Re: [RFC 0/2] CPU frequency scaled from a task's load on an idle wakeup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <54608321.5080602@linux.vnet.ibm.com>
Date:	Mon, 10 Nov 2014 14:49:29 +0530
From:	Shilpasri G Bhat <shilpa.bhat@...ux.vnet.ibm.com>
To:	linux-kernel@...r.kernel.org
CC:	linux-pm@...r.kernel.org, mturquette@...aro.org,
	amit.kucheria@...aro.org, vincent.guittot@...aro.org,
	daniel.lezcano@...aro.org, Morten.Rasmussen@....com, efault@....de,
	nicolas.pitre@...aro.org, dietmar.eggemann@....com, pjt@...gle.com,
	bsegall@...gle.com, peterz@...radead.org, mingo@...nel.org,
	linaro-kernel@...ts.linaro.org,
	Preeti U Murthy <preeti@...ux.vnet.ibm.com>
Subject: Re: [RFC 0/2] CPU frequency scaled from a task's load on an idle
 wakeup

Experimental Results:

Tested on a powerpc machine with 16 cores and obtained following results with
patchset.

I ran a modified version of ebizzy called sleeping-ebizzy which runs ebizzy at
various levels of utilization. The following results were found by running
ebizzy with 1 thread for 30s.

  Utilization(%)	Difference(%) in records/s with patch
  --------------        -------------------------------------
	10			-0.5516335445
	20			+0.0196049675
	30			+0.2222333684
	40			+0.3205441843
	50			-0.0103332452
	60			-0.3525380134
	70			+0.428654342
	80			+0.1527132862
	90			+0.0758061406

Thanks and Regards,
Shilpa

On 11/10/2014 11:15 AM, Shilpasri G Bhat wrote:
> This patch set aims to solve a problem in cpufreq governor's CPU
> load calculation logic when the CPU wakes up after an idle period.
> In the current logic when a CPU wakes up from an idle state the
> 'previous load' of the CPU is used as its current load on the
> alternate wakeups.
> 
> A latency-sensitive-bursty task will be benefited from this logic if
> it wakes up on a CPU on which it was initially running, with a
> non-compromised CPU 'previous load' i.e, the 'previous load' holds
> the last calculated CPU load before the task went to sleep. In such
> a case, the cpufreq governor will account to high previous CPU load
> and decides to run at high frequency.
> 
> The problem in this logic is that the 'previous load' which is meant
> to help certain latency-sensitive-bursty tasks can get used by some
> periodic-small tasks(like kernel daemons) to its advantage if the
> small task woke up first on the CPU. This will deprive the the
> latency-sensitive-bursty tasks from running at high frequency until
> the cpufreq governor notices the 100% CPU utilization. If this pattern
> gets repeated in the due course of bursty task's execution we will
> land on the same problem which 'prev_load' had originally set forth to 
> solve.
> 
> Probably we could reduce these inefficiencies if the cpufreq
> governor was aware of the task's nature, while calculating the load
> during an idle-wakeup scenario. So instead of using the previous
> load for the CPU , the load can be deduced on the basis of incoming
> task's load.
> 
> In this patch we use a metric built on top of 'load_avg_contrib'.
> 'load_avg_contrib' of a task's sched entity can describe the nature
> of the task in terms of its CPU utilization. The properties of this
> metric to encapsulate the CPU utilization of a task makes it a
> potential candidate for scaling CPU frequency. However, due to the
> nature of its design 'load_avg_contrib' cannot pick up the task's
> load rapidly after a wakeup. As we are trying to solve the problem
> on idle-wakeup case we cannot use this metric value as is to scale
> the frequency. So we measure the cumulative moving average of
> 'load_avg_contrib'.
> 
> The cumulative average of 'load_avg_contrib' at a given point is the
> average of all the values of 'load_avg_contrib' up until that point.
> The current average of a new 'load_avg_contrib' value is as below:
> 
> Cumulative_average(n+1) = x(n+1) + Cumulative_average(n) * n
>                         ---------------------------------------
>                                         n+1
> where,
> Cumulative_average(n+1) is the current cumulative average
> x(n+1) is the latest 'load_avg_contrib' value
> Cumulative_average(n) is the previous cumulative average
> n+1 is the number of 'load_avg_contrib' values so far
> 
> The cumulative average of 'load_avg_contrib' will help us smooth out
> the short-term fluctuations and highlight long-term trend of
> 'load-avg_contrib' metric. So cumulative average of the task can
> depict the nature of the task more effectively. Thus we can scale CPU
> frequency based on the cumulative average of the task and make
> calculative decisions whether to decrease or increase the frequency
> depending on the nature of the task.
> 
> Shilpasri G Bhat (2):
>   sched/fair: Add cumulative average of load_avg_contrib to a task
>   cpufreq: governor: CPU frequency scaled from task's cumulative-load on
>     an idle wakeup
> 
>  drivers/cpufreq/cpufreq_governor.c | 39 +++++++++++++++-----------------------
>  drivers/cpufreq/cpufreq_governor.h |  9 ++-------
>  include/linux/sched.h              |  4 ++++
>  kernel/sched/core.c                | 35 ++++++++++++++++++++++++++++++++++
>  kernel/sched/fair.c                |  6 +++++-
>  kernel/sched/sched.h               |  2 +-
>  6 files changed, 62 insertions(+), 33 deletions(-)
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/