Message-Id: <1364226006-21419-3-git-send-email-morten.rasmussen@arm.com>
Date:	Mon, 25 Mar 2013 15:40:06 +0000
From:	Morten Rasmussen <morten.rasmussen@....com>
To:	linux-kernel@...r.kernel.org, linaro-kernel@...ts.linaro.org,
	peterz@...radead.org, mingo@...nel.org, pjt@...gle.com,
	vincent.guittot@...aro.org
Cc:	alex.shi@...el.com, preeti@...ux.vnet.ibm.com,
	paulmck@...ux.vnet.ibm.com, tglx@...utronix.de, corbet@....net,
	amit.kucheria@...aro.org, robin.randhawa@....com,
	morten.rasmussen@....com
Subject: [RFC PATCH 2/2] sched: Pull tasks from cpus with multiple tasks when idle

If a cpu is idle and another cpu has more than one runnable task,
pull one of them without considering the cpu_power of the source or
target cpu. This allows low cpu_power cpus to offload potentially
oversubscribed high cpu_power cpus.

In heterogeneous systems containing cpus with different cpu_power,
the load-balancer will put more tasks on sched_domains with high
(above default) cpu_power cpus and fewer on sched_domains with low
cpu_power cpus. Hence, if the number of running tasks is equal to
the number of cpus, the load-balancer may decide to leave low
cpu_power cpus idle and place more than one task on each high
cpu_power cpu. This is not an optimal use of the available compute
resources.

Placing one task on each cpu before adding more to any of the high
cpu_power cpus should generally give better overall throughput,
regardless of the cpu_power of the cpus.

Signed-off-by: Morten Rasmussen <morten.rasmussen@....com>
Reviewed-by: Vincent Guittot <vincent.guittot@...aro.org>
---
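The idle-pull rule described in the commit message can be sketched
outside the kernel as a small predicate. This is only an illustration
under stated assumptions: the enum, the struct and the function name
below are hypothetical stand-ins, not the kernel's definitions. Only
the comparison itself (group_weight < sum_nr_running while the
balancing cpu is idle or newly idle) is taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's cpu idle states. */
enum sketch_idle_type {
	SKETCH_CPU_IDLE,
	SKETCH_CPU_NOT_IDLE,
	SKETCH_CPU_NEWLY_IDLE,
};

/* Hypothetical, reduced view of the per-group statistics. */
struct group_stats {
	unsigned int group_weight;	/* number of cpus in the group */
	unsigned int sum_nr_running;	/* runnable tasks on those cpus */
};

/*
 * Return true when an idle (or newly idle) cpu should treat the group
 * as imbalanced, i.e. the group has more runnable tasks than cpus.
 * cpu_power is deliberately not consulted.
 */
static bool idle_pull_wanted(enum sketch_idle_type idle,
			     const struct group_stats *gs)
{
	if (idle != SKETCH_CPU_IDLE && idle != SKETCH_CPU_NEWLY_IDLE)
		return false;

	return gs->group_weight < gs->sum_nr_running;
}
```

With two cpus and three runnable tasks in a group, the predicate is
true for an idle or newly idle balancing cpu and false for a busy one.
With two cpus and two tasks it is false in all cases, so a group
running exactly one task per cpu is never flagged by this rule.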
 kernel/sched/fair.c |   21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4781cdd..095885c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4039,7 +4039,8 @@ static int move_tasks(struct lb_env *env)
 		if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
 			goto next;
 
-		if ((load / 2) > env->imbalance)
+		if ((load / 2) > env->imbalance &&
+			(env->idle != CPU_IDLE && env->idle != CPU_NEWLY_IDLE))
 			goto next;
 
 		if (!can_migrate_task(p, env))
@@ -4539,6 +4540,15 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 	if (overloaded_cpu)
 		sgs->group_imb = 1;
 
+	/*
+	 * When idle balancing pull tasks if more than one task per cpu
+	 * in group
+	 */
+	if (env->idle == CPU_IDLE || env->idle == CPU_NEWLY_IDLE) {
+		if (group->group_weight < sgs->sum_nr_running)
+			sgs->group_imb = 1;
+	}
+
 	sgs->group_capacity = DIV_ROUND_CLOSEST(group->sgp->power,
 						SCHED_POWER_SCALE);
 	if (!sgs->group_capacity)
@@ -4766,8 +4776,13 @@ void fix_small_imbalance(struct lb_env *env, struct sd_lb_stats *sds)
 			min(sds->this_load_per_task, sds->this_load + tmp);
 	pwr_move /= SCHED_POWER_SCALE;
 
-	/* Move if we gain throughput */
-	if (pwr_move > pwr_now)
+	/*
+	 * Move if we gain throughput, or if we have cpus idling while others
+	 * are running more than one task.
+	 */
+	if ((pwr_move > pwr_now) ||
+		(sds->busiest_group_weight < sds->busiest_nr_running &&
+		(env->idle == CPU_IDLE || env->idle == CPU_NEWLY_IDLE)))
 		env->imbalance = sds->busiest_load_per_task;
 }
 
-- 
1.7.9.5

