linux-kernel - Re: power-efficient scheduling design


Message-ID: <51B1FAE4.2070700@linux.intel.com>
Date:	Fri, 07 Jun 2013 08:23:16 -0700
From:	Arjan van de Ven <arjan@...ux.intel.com>
To:	Preeti U Murthy <preeti@...ux.vnet.ibm.com>
CC:	Ingo Molnar <mingo@...nel.org>,
	Morten Rasmussen <morten.rasmussen@....com>,
	alex.shi@...el.com, peterz@...radead.org,
	vincent.guittot@...aro.org, efault@....de, pjt@...gle.com,
	linux-kernel@...r.kernel.org, linaro-kernel@...ts.linaro.org,
	len.brown@...el.com, corbet@....net,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	tglx@...utronix.de
Subject: Re: power-efficient scheduling design

On 6/6/2013 11:03 PM, Preeti U Murthy wrote:
> Hi,
>
> On 05/31/2013 04:22 PM, Ingo Molnar wrote:
>> PeterZ and me tried to point out the design requirements previously, but
>> it still does not appear to be clear enough to people, so let me spell it
>> out again, in a hopefully clearer fashion.
>>
>> The scheduler has valuable power saving information available:
>>
>>   - when a CPU is busy: about how long the current task expects to run
>>
>>   - when a CPU is idle: how long the current CPU expects _not_ to run
>>
>>   - topology: it knows how the CPUs and caches interrelate and already
>>     optimizes based on that

And I will argue that we do too much of this already; various caches (and TLBs) get flushed
(on x86 at least) much, much more often than you'd think.

>>
>> so the scheduler is in an _ideal_ position to do a judgement call about
>> the near future

this part I will buy

>> and estimate how deep an idle state a CPU core should
>> enter into and what frequency it should run at.

This part I cannot buy.
First of all, we really need to stop thinking about choosing a frequency (at least for x86).
That concept basically died for x86 6 years ago.

Second, the interaction between these two, and the question of "what does it mean if I choose something",
is highly hardware specific and complex nowadays, and will only become more so going forward.
If anything, we've been moving AWAY from centralized infrastructure there, going towards
CPU specific drivers/policies. And the hardware rules are very different between platforms here.
On Intel, asking for different performance is just an MSR write, and going idle is usually just one instruction.
On some ARM platforms, this might involve long, complex calculations, or even a *blocking* operation
that manipulates VRs and PLLs directly, depending on the platform and the states you want to pick.
(Hence the CPUFREQ design of requiring changes to be done in a kernel thread.)

Now, I would like the scheduler to give some notifications at certain events (like migrations,
starting realtime tasks)...but a few atomic notifier chains will do for that.
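
To illustrate the notifier idea, here is a minimal user-space sketch of a notifier
chain, loosely modeled on the kernel's struct notifier_block. It is an assumption-laden
toy, not the kernel implementation: the event codes (SCHED_EV_MIGRATION,
SCHED_EV_RT_START), chain_register(), chain_call() and struct policy_state are
hypothetical names made up for this example, and there is no locking or RCU, which a
real atomic notifier chain would need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event codes; these are NOT real kernel constants. */
enum sched_event {
    SCHED_EV_MIGRATION = 1,   /* a task moved to another CPU */
    SCHED_EV_RT_START  = 2,   /* a realtime task started running */
};

struct notifier_block;
typedef int (*notifier_fn)(struct notifier_block *nb,
                           unsigned long event, void *data);

/* Shaped after the kernel's struct notifier_block (callback, next, priority). */
struct notifier_block {
    notifier_fn call;
    int priority;                  /* higher priority is called first */
    struct notifier_block *next;
};

struct notifier_head {
    struct notifier_block *head;
};

/* Insert nb into the chain, keeping it sorted by descending priority. */
static void chain_register(struct notifier_head *nh, struct notifier_block *nb)
{
    struct notifier_block **p = &nh->head;

    assert(nb != NULL && nb->call != NULL);
    while (*p && (*p)->priority >= nb->priority)
        p = &(*p)->next;
    nb->next = *p;
    *p = nb;
}

/* Call every registered callback; returns how many were invoked. */
static int chain_call(struct notifier_head *nh, unsigned long event, void *data)
{
    int called = 0;

    for (struct notifier_block *nb = nh->head; nb; nb = nb->next) {
        nb->call(nb, event, data);
        called++;
    }
    return called;
}

/* A hypothetical hardware specific policy that only counts events. */
struct policy_state {
    struct notifier_block nb;      /* first member, so the cast below is valid */
    unsigned long migrations;
    unsigned long rt_starts;
};

static int policy_cb(struct notifier_block *nb, unsigned long event, void *data)
{
    struct policy_state *ps = (struct policy_state *)nb;

    (void)data;                    /* unused in this sketch */
    if (event == SCHED_EV_MIGRATION)
        ps->migrations++;
    else if (event == SCHED_EV_RT_START)
        ps->rt_starts++;
    return 0;
}
```

The point of the shape: the scheduler side only calls chain_call() at the interesting
events, and whatever the hardware specific policy does with them stays in its own
callback, outside the scheduler.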

The policies will be very hardware specific, and thus will live outside the scheduler, no matter which way you
put it. Now, the scheduler can and should participate more in terms of sharing information in both directions.
That, I think, we can all agree on.
