Message-ID: <5029FFB0.4020309@intel.com>
Date:	Tue, 14 Aug 2012 15:35:12 +0800
From:	Alex Shi <alex.shi@...el.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Suresh Siddha <suresh.b.siddha@...el.com>,
	Arjan van de Ven <arjan@...ux.intel.com>,
	vincent.guittot@...aro.org, svaidy@...ux.vnet.ibm.com,
	Ingo Molnar <mingo@...nel.org>
CC:	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [discussion]sched: a rough proposal to enable power saving in
 scheduler

On 08/13/2012 08:21 PM, Alex Shi wrote:

> Since there is no power-saving consideration in the CFS scheduler, I have
> a very rough idea for enabling a new power-saving scheme in CFS.
> 
> It is based on the following assumptions:
> 1, If the system is crowded with many tasks, letting only a few domains'
> cpus run while the other cpus idle cannot save power. Letting all cpus
> take the load, finish the tasks early, and then go idle will save more
> power and give a better user experience.
> 
> 2, Scheduler domains and scheduler groups match the hardware, and thus
> the power consumption units. So, pulling all tasks out of a domain means
> that power consumption unit can potentially go idle.
> 
> So, following what Peter mentioned in commit 8e7fbcbc22c (sched: Remove
> stale power aware scheduling), this proposal will adopt the
> sched_balance_policy concept and use 2 kinds of policy: performance and
> power.
> 
> And in scheduling, 2 places will care about the policy: load_balance()
> and, on task fork/exec, select_task_rq_fair().



Any comments on this rough proposal, especially on the assumptions?

> 
> Here is some pseudo code that tries to explain the proposed behaviour in
> load_balance() and select_task_rq_fair():
> 
> 
> load_balance() {
> 	update_sd_lb_stats(); // get busiest group, idlest group data.
> 
> 	if (sd->nr_running > sd's capacity) {
> 		// the power saving policy is not suitable for this
> 		// scenario; run like the performance policy.
> 		move tasks from busiest cpu in busiest group to
> 		idlest cpu in idlest group;
> 	} else { // the sd has enough capacity to hold all tasks.
> 		if (sg->nr_running > sg's capacity) {
> 			// imbalance between groups
> 			if (schedule policy == performance) {
> 				// when 2 busiest groups are at the same
> 				// busy degree, need to prefer the one
> 				// with the softest group??
> 				move tasks from busiest group to
> 					idlest group;
> 			} else if (schedule policy == power)
> 				move tasks from busiest group to
> 				idlest group until busiest is just full
> 				of capacity.
> 				// the busiest group can balance
> 				// internally on the next LB.
> 		} else {
> 			// all groups have enough capacity for their tasks.
> 			if (schedule policy == performance)
> 				// all tasks may have enough cpu
> 				// resources to run;
> 				// move tasks from busiest to idlest group?
> 				// no, at this time, it's better to keep
> 				// the task on its current cpu.
> 				// so, it is maybe better to balance
> 				// within each of the groups:
> 				for_each_imbalanced_group()
> 					move tasks from busiest cpu to
> 					idlest cpu in each of the groups;
> 			else if (schedule policy == power) {
> 				if (no hard pin in idlest group)
> 					move tasks from idlest group to
> 					busiest until busiest is full.
> 				else
> 					move unpinned tasks to the biggest
> 					hard-pinned group.
> 			}
> 		}
> 	}
> }
> 
> select_task_rq_fair()
> {
> 	for_each_domain(cpu, tmp) {
> 		if (policy == power && tmp_has_capacity &&
> 			 tmp->flags & sd_flag) {
> 			sd = tmp;
> 		// it is fine to get a cpu in this domain
> 			break;
> 		}
> 	}
> 
> 	while(sd) {
> 		if policy == power
> 			find_busiest_and_capable_group()
> 		else
> 			find_idlest_group();
> 		if (!group) {
> 			sd = sd->child;
> 			continue;
> 		}
> 		...
> 	}
> }
> 
> sub proposals:
> 1, If possible, balance tasks directly onto the idlest cpu rather than
> onto the appointed 'balance cpu'. That may save one more round of
> balancing. The idlest cpu can prefer the newly idle cpu, and is the
> least loaded cpu.
> 2, se or task load is good for setting running time, but it should be
> the second basis in load balancing. The first basis of LB is the number
> of running tasks in a group/cpu. Since, whatever the weight of a group
> is, if the number of tasks is less than the number of cpus, the group
> still has capacity to take more tasks. (SMT cpu power and big/little
> cpu capacity on ARM will be considered later.)
> 
> unsolved issues:
> 1, like the current scheduler, it does not handle cpu affinity well in
> load_balance.
> 2, task groups are not considered well in this rough proposal.
> 
> It isn't fully thought through and may contain mistakes. So I'm just
> sharing my ideas and hope they become better and workable through your
> comments and discussion.
> 
> Thanks
> Alex

