Message-ID: <20090513130541.21440.33364.stgit@drishya.in.ibm.com>
Date: Wed, 13 May 2009 18:41:00 +0530
From: Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>
To: Linux Kernel <linux-kernel@...r.kernel.org>,
Suresh B Siddha <suresh.b.siddha@...el.com>,
Venkatesh Pallipadi <venkatesh.pallipadi@...el.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Arjan van de Ven <arjan@...radead.org>
Cc: Ingo Molnar <mingo@...e.hu>, Dipankar Sarma <dipankar@...ibm.com>,
Balbir Singh <balbir@...ux.vnet.ibm.com>,
Vatsa <vatsa@...ux.vnet.ibm.com>,
Gautham R Shenoy <ego@...ibm.com>,
Andi Kleen <andi@...stfloor.org>,
Gregory Haskins <gregory.haskins@...il.com>,
Mike Galbraith <efault@....de>,
Thomas Gleixner <tglx@...utronix.de>,
Arun Bharadwaj <arun@...ux.vnet.ibm.com>,
Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>
Subject: [RFC PATCH v2 0/2] Saving power by cpu evacuation sched_max_capacity_pct=n
Hi,
The idea of extending sched_mc_powersavings tunable for cpu evacuation
was discussed at http://lwn.net/Articles/330309/
The summary of the discussion is as follows:
* Using sched_mc=3,4,5 to evacuate 1,2,4 cores is a completely
non-intuitive and broken interface. Ingo wanted to see if we can
model a global percentile tunable that would map to core throttling.
* Peter Zijlstra wanted more justification for throttling at the core
level. Throttling may be a resource management problem rather than
a scheduler/load balancer problem.
* CPU hotplug and cpuset/cgroup based cpu throttling are viable
alternatives to this approach.
Changes in v2:
* Created a percentage knob sched_max_capacity_pct=n
Defaults to 100, can be set to 75 or 50 to evacuate cores
* This patch is still a hack for discussion and has many
limitations.
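As a rough illustration of how the percentage knob could map to a core
count (this is not taken from the patch), the sketch below rounds the
allowed capacity up to a whole core. The function name cores_in_use and
the round-up rule are assumptions; the rule was chosen only because it
reproduces the "cores used" column in the ebizzy results later in this
message (100, 87, 75, 62, 50 giving 8, 7, 6, 5, 4 on an 8-core system).

```python
def cores_in_use(total_cores, max_capacity_pct):
    """Hypothetical mapping from sched_max_capacity_pct to usable cores.

    Assumption: the allowed capacity is rounded up to a whole core.
    The real patch may compute this differently.
    """
    if not 0 < max_capacity_pct <= 100:
        raise ValueError("max_capacity_pct must be in the range 1..100")
    # Integer ceiling of total_cores * pct / 100.
    return (total_cores * max_capacity_pct + 99) // 100


if __name__ == "__main__":
    for pct in (100, 87, 75, 62, 50):
        print(pct, cores_in_use(8, pct))
```

Plain truncating division would give 6 and 4 cores for 87 and 62, which
does not match the reported settings, hence the round-up assumption.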
v1: http://lkml.org/lkml/2009/4/26/202
Intro and parts from previous post for quick reference:
------------------------------------------------------
Objective:
----------
* Framework to evacuate tasks from cpus in order to force the cpu
cores to stay at idle. Forcefully idling cores and packages can
reduce power consumption.
* Fast response time and low OS overhead to move tasks away from
selected cpu packages. CPU hotplug is too heavyweight for this
purpose.
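To make the goal concrete, here is a toy model of evacuation on a
two-package, four-cores-per-package system. It only shows which cores
stay in use when the allowed core count shrinks, and which packages end
up fully idle. The names active_cores and idle_packages are
hypothetical, and filling packages in order is an assumption made for
illustration; it is not how the load balancer patch selects cores.

```python
def active_cores(packages, cores_per_package, allowed):
    """Return the (package, core) pairs still used when only `allowed`
    cores may run tasks. Assumption: packages are filled in order."""
    total = packages * cores_per_package
    if not 0 <= allowed <= total:
        raise ValueError("allowed must be between 0 and the total core count")
    return [(n // cores_per_package, n % cores_per_package)
            for n in range(allowed)]


def idle_packages(packages, cores_per_package, allowed):
    """Return the packages that have no active core left."""
    used = {pkg for pkg, _ in active_cores(packages, cores_per_package, allowed)}
    return [pkg for pkg in range(packages) if pkg not in used]


if __name__ == "__main__":
    for allowed in (8, 6, 4):
        print(allowed, idle_packages(2, 4, allowed))
```

In this model an entire package goes idle only once the allowed count
drops to four cores; at six cores both packages still have active
cores, so only core-level idle savings would apply.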
Use cases:
---------
* Ability to throttle the number of cores used in the system along
with other power saving controls like cpufreq governors can enable
the system to operate at a more power efficient operating point and
still meet the design objectives.
* Facilitate thermal management by evacuating cores from hot cpu
packages
Alternatives:
-------------
* CPU hotplug: Heavyweight and slow. Setup and teardown of data
structures are involved. May need new fast or lightweight
notifications.
* CPUSets: Exclusive CPU sets and partitioned sched domains involve
rebuilding sched domains and are relatively heavyweight for the purpose.
The following patch is against 2.6.30-rc5 and will work only in an
underutilised system (number of tasks <= number of cores).
Test results for ebizzy with 8 threads at various sched_max_capacity_pct
settings. The test platform is a dual-socket quad-core x86 system
(pre-Nehalem).
This is an interesting characteristic of the ebizzy benchmark, where
the following command line improved in performance as we evacuated
cores! Perhaps cross-cache traffic... I will verify that next time.
ebizzy -s 4096 -t 8 -S 30
sched_mc_power_savings was set to 2 in the experiment
-----------------------------------------------------------------
sched_max_capacity_pct   No. of cores   Performance     AvgPower
                         used           (Records/sec)   (Watts)
-----------------------------------------------------------------
100                      8              1.00x           1.00y
87                       7              1.03x           0.98y
75                       6              1.06x           0.95y
62                       5              1.26x           0.91y
50                       4              1.15x           0.86y
-----------------------------------------------------------------
There was wide run-to-run variation with ebizzy. The purpose of the above
data is to justify use of core evacuation for power vs performance
trade-offs. The patch does not yet work for kernbench and other
complex workloads/benchmarks. I even tried SPECjbb and did not get the
expected CPU utilisation at various settings to reduce power
consumption. The utilisation/power was much lower than expected.
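For readers who want the trade-off as a single number, the sketch below
divides relative performance by relative power for each row of the
ebizzy table. The input values are copied from the table; the ratio is
only relative to the 100% baseline, because x and y are unspecified
units. The function name perf_per_watt is made up for this example, and
given the run variation noted above the output should be read as
indicative only.

```python
# (sched_max_capacity_pct, cores used, relative performance, relative power)
# Values copied from the ebizzy table in this message.
RESULTS = [
    (100, 8, 1.00, 1.00),
    (87, 7, 1.03, 0.98),
    (75, 6, 1.06, 0.95),
    (62, 5, 1.26, 0.91),
    (50, 4, 1.15, 0.86),
]


def perf_per_watt(results):
    """Relative performance divided by relative power, keyed by setting."""
    return {pct: perf / power for pct, _cores, perf, power in results}


if __name__ == "__main__":
    for pct, ratio in perf_per_watt(RESULTS).items():
        print(f"{pct:3d}%  {ratio:.2f}")
```

On these figures every reduced setting has a ratio above the baseline,
with the 62% setting highest.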
ToDo:
-----
* Identify good benchmark to demonstrate benefits of cpu evacuation
* Make the core evacuation predictable under different system load
conditions and workload characteristics. This is turning out to be
a major challenge in this approach.
* Enhance framework to control which particular packages/cores will be
evacuated; this is needed for thermal management. The
CPU hotplug/cpuset approach will solve this problem.
I can experiment with different benchmarks/platforms and post results
while the framework is being discussed.
Please let me know your comments and suggestions.
Thanks,
Vaidy
---
Vaidyanathan Srinivasan (2):
sched: loadbalancer hacks for forced packing of tasks
sched: add sched_max_capacity_pct
kernel/sched.c | 65 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 64 insertions(+), 1 deletions(-)