linux-kernel - Re: [discussion]sched: a rough proposal to enable power saving in scheduler


Message-ID: <502C676A.7050001@intel.com>
Date:	Thu, 16 Aug 2012 11:22:18 +0800
From:	Alex Shi <alex.shi@...el.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
CC:	Borislav Petkov <bp@...en8.de>,
	Suresh Siddha <suresh.b.siddha@...el.com>,
	Arjan van de Ven <arjan@...ux.intel.com>,
	vincent.guittot@...aro.org, svaidy@...ux.vnet.ibm.com,
	Ingo Molnar <mingo@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Paul Turner <pjt@...gle.com>
Subject: Re: [discussion]sched: a rough proposal to enable power saving in
 scheduler

On 08/15/2012 10:43 PM, Peter Zijlstra wrote:

> On Wed, 2012-08-15 at 15:15 +0200, Borislav Petkov wrote:
>> On Wed, Aug 15, 2012 at 01:05:38PM +0200, Peter Zijlstra wrote:
>>> On Mon, 2012-08-13 at 20:21 +0800, Alex Shi wrote:
>>>> Since the CFS scheduler has no power-saving consideration, I have a
>>>> very rough idea for enabling a new power-saving scheme in CFS.
>>>
>>> Adding Thomas, he always delights in poking holes in power schemes.
>>>
>>>> It is based on the following assumption:
>>>> 1, If many tasks crowd the system, letting only a few domains' cpus
>>>> run while the other cpus idle cannot save power. Letting all cpus take
>>>> the load, finish the tasks early, and then go idle will save more power
>>>> and give a better user experience.
>>>
>>> I'm not sure this is a valid assumption. I've had it explained to me by
>>> various people that race-to-idle isn't always the best thing. It has to
>>> do with the cost of switching power states and the duration of execution
>>> and other such things.
>>
>> I think what he means here is that we might want to let all cores on
>> the node (i.e., domain) finish and then power down the whole node which
>> should bring much more power savings than letting a subset of the cores
>> idle. Alex?
> 
> Sure we can do that.
> 
>>> So I'd leave the currently implemented scheme as performance, and I
>>> don't think the above describes the current state.
>>>
>>>> 			} else if (schedule policy == power)
>>>> 				move tasks from busiest group to
>>>> 				idlest group until busiest is just full
>>>> 				of capacity.
>>>> 				//the busiest group can balance
>>>> 				//internally after next time LB,
>>>
>>> There's another thing we need to do, and that is collect tasks in a
>>> minimal number of power domains.
>>
>> Yep.
>>
>> Btw, what heuristic would tell here when a domain overflows and another
>> needs to get woken? Combined load of the whole domain?
>>
>> And if I absolutely positively don't want a node to wake up, do I
>> hotplug its cores off or are we going to have a way to tell the
>> scheduler to overcommit the non-idle domains and spread the tasks only
>> among them.
>>
>> I'm thinking of short bursts here where it would probably be beneficial
>> to let the tasks wait runnable for a while rather than wake up the next
>> node and waste power...
> 
> I was thinking of a utilization measure made of per-task weighted
> runnable averages. This should indeed cover that case and we'll overflow
> when on average there is no (significant) idle time over a period longer
> than the averaging period.


It's also a good idea. :)
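
To make the idea concrete, here is a minimal user-space sketch of such a utilization measure. It is not kernel code and not taken from any posted patch; every name in it (task_util, UTIL_SCALE, domain_overflows, the 3/4 decay factor, the idle_margin parameter) is a hypothetical choice for illustration. Each task keeps a decayed average of how often it was runnable, the domain sums those averages scaled by task weight, and the domain counts as overflowing once that sum leaves no significant idle time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fixed-point scale: 1024 means "runnable 100% of the time". */
#define UTIL_SCALE 1024

/* Hypothetical per-task state for this sketch; not a kernel structure. */
struct task_util {
	unsigned int weight;       /* relative task weight, 1024 = nominal */
	unsigned int runnable_avg; /* decayed runnable fraction, 0..UTIL_SCALE */
};

/*
 * Fold one sampling period into the task's average.
 * Assumed decay: new = 3/4 * old + 1/4 * sample.
 */
static void task_util_update(struct task_util *t, bool was_runnable)
{
	unsigned int sample = was_runnable ? UTIL_SCALE : 0;

	t->runnable_avg = (3 * t->runnable_avg + sample) / 4;
}

/* Sum of the tasks' runnable averages, each scaled by its weight. */
static unsigned long domain_util(const struct task_util *tasks, size_t n)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += (unsigned long)tasks[i].runnable_avg *
		       tasks[i].weight / UTIL_SCALE;
	return sum;
}

/*
 * The domain overflows when the averaged utilization leaves less than
 * idle_margin of spare capacity across its nr_cpus cpus.
 */
static bool domain_overflows(const struct task_util *tasks, size_t n,
			     unsigned int nr_cpus, unsigned int idle_margin)
{
	unsigned long capacity = (unsigned long)nr_cpus * UTIL_SCALE;

	assert(idle_margin <= capacity);
	return domain_util(tasks, n) + idle_margin >= capacity;
}
```

With this shape, a short burst only raises a task's average slowly, so a two-cpu domain running one always-busy task plus a task that has been runnable for a single period would not yet overflow, which is the behaviour Boris asked about.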

> 
> Anyway, I'm not too set on this and I'm very sure we can tweak this ad
> infinitum, so starting with something relatively simple that works for
> most is preferred.
> 
> As already stated, I think some of the Linaro people actually played
> around with something like this based on PJTs patches.


Vincent, would you like to tell us more about that?