linux-kernel - [tip:sched/core] sched/fair: Make utilization tracking CPU scale-invariant

Message-ID: <tip-e3279a2e6d697e00e74f905851ee7cf532f72b2d@git.kernel.org>
Date:	Sun, 13 Sep 2015 04:04:06 -0700
From:	tip-bot for Dietmar Eggemann <tipbot@...or.com>
To:	linux-tip-commits@...r.kernel.org
Cc:	morten.rasmussen@....com, efault@....de, peterz@...radead.org,
	hpa@...or.com, mingo@...nel.org, tglx@...utronix.de,
	linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org,
	dietmar.eggemann@....com
Subject: [tip:sched/core] sched/fair:
  Make utilization tracking CPU scale-invariant

Commit-ID:  e3279a2e6d697e00e74f905851ee7cf532f72b2d
Gitweb:     http://git.kernel.org/tip/e3279a2e6d697e00e74f905851ee7cf532f72b2d
Author:     Dietmar Eggemann <dietmar.eggemann@....com>
AuthorDate: Sat, 15 Aug 2015 00:04:41 +0100
Committer:  Ingo Molnar <mingo@...nel.org>
CommitDate: Sun, 13 Sep 2015 09:52:56 +0200

sched/fair: Make utilization tracking CPU scale-invariant

In addition to the existing frequency scale-invariance correction
factor, apply a CPU scale-invariance correction factor to utilization
tracking to compensate for differences in compute capacity. Such
differences can stem from micro-architectural differences (i.e.
instructions per second) between CPUs in HMP systems (e.g. big.LITTLE),
and/or from differences in the current maximum frequency supported by
individual CPUs in SMP systems. In the existing implementation,
utilization is not comparable between CPUs because it is relative to the
capacity of each individual CPU.

Each segment of the sched_avg.util_sum geometric series is now also
scaled by the CPU performance factor, so the sched_avg.util_avg of each
sched entity is independent of the particular CPU of the HMP/SMP system
on which the sched entity is scheduled.

With this patch, the utilization of a CPU stays relative to the maximum
performance of the fastest CPU in the system.
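
As a rough illustration of the scaling described above, the following
user-space sketch applies a frequency factor and then a CPU capacity
factor to a slice of running time. It is not the kernel code: the helper
names, the 10-bit fixed-point base, and the capacity value 430 for a
slower CPU are assumptions made for this example, and the definition of
the scale() helper used in the diff is not shown in this patch.

```c
#include <assert.h>

/* Assumption: capacities are fixed-point values where 1024 means
 * "fastest CPU in the system running at its maximum frequency". */
#define SKETCH_CAPACITY_SHIFT 10
#define SKETCH_CAPACITY_SCALE (1UL << SKETCH_CAPACITY_SHIFT)

/* Hypothetical helper: multiply v by the fraction s / 1024. */
static unsigned long sketch_scale(unsigned long v, unsigned long s)
{
	assert(s <= SKETCH_CAPACITY_SCALE);
	return (v * s) >> SKETCH_CAPACITY_SHIFT;
}

/* Utilization contribution of one slice of running time: scaled first
 * by the current frequency factor, then by the CPU capacity factor. */
static unsigned long sketch_util_contrib(unsigned long running_time,
					 unsigned long scale_freq,
					 unsigned long scale_cpu)
{
	unsigned long scaled_delta = sketch_scale(running_time, scale_freq);

	return sketch_scale(scaled_delta, scale_cpu);
}
```

Under these assumptions, 1024 units of running time at full frequency
contribute 1024 on a CPU of capacity 1024 but only 430 on a CPU of
capacity 430, so both values are expressed relative to the fastest CPU.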

In contrast to utilization (sched_avg.util_sum), load
(sched_avg.load_sum) should not be scaled by compute capacity. The
utilization metric is based on running time, which only makes sense when
CPUs are _not_ fully utilized (utilization cannot go beyond 100% even if
more tasks are added), whereas load is based on runnable time, which
isn't limited by the capacity of the CPU and is therefore a better
metric for overloaded scenarios. If we run two nice-0 busy loops on two
CPUs with different compute capacity, their load should be similar since
their compute demands are the same. We have to assume that the compute
demand of any task running on a fully utilized CPU (no spare cycles =
100% utilization) is high and the same regardless of the compute
capacity of its current CPU, hence we shouldn't scale load by CPU
capacity.
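
The distinction between load and utilization can be sketched in user
space as follows. This is a simplified model, not the kernel's
__update_load_avg(): the struct, the function names, the nice-0 weight
of 1024 and the capacity value 430 are assumptions for illustration, and
the geometric decay of the real series is left out.

```c
#include <assert.h>

#define SKETCH_SHIFT 10
#define SKETCH_SCALE (1UL << SKETCH_SHIFT)
#define SKETCH_NICE_0_WEIGHT 1024UL	/* assumed weight of a nice-0 task */

/* Hypothetical, undecayed stand-in for the sums kept in sched_avg. */
struct sketch_avg {
	unsigned long load_sum;
	unsigned long util_sum;
};

static unsigned long sketch_cap_scale(unsigned long v, unsigned long s)
{
	assert(s <= SKETCH_SCALE);
	return (v * s) >> SKETCH_SHIFT;
}

/* Account one time slice: load is scaled by frequency only, while
 * utilization is additionally scaled by the CPU capacity factor. */
static void sketch_accumulate(struct sketch_avg *sa, unsigned long delta,
			      unsigned long weight, int running,
			      unsigned long scale_freq,
			      unsigned long scale_cpu)
{
	unsigned long scaled_delta = sketch_cap_scale(delta, scale_freq);

	if (weight)
		sa->load_sum += weight * scaled_delta;
	if (running)
		sa->util_sum += sketch_cap_scale(scaled_delta, scale_cpu);
}
```

With one nice-0 busy loop accounted on a capacity-1024 CPU and another
on a capacity-430 CPU, both at full frequency, the two load_sum values
come out equal while the util_sum values differ in proportion to the
capacities.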

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@....com>
Signed-off-by: Morten Rasmussen <morten.rasmussen@....com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Mike Galbraith <efault@....de>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Link: http://lkml.kernel.org/r/55CE7409.1000700@arm.com
Signed-off-by: Ingo Molnar <mingo@...nel.org>
---
 include/linux/sched.h | 2 +-
 kernel/sched/fair.c   | 7 ++++---
 kernel/sched/sched.h  | 2 +-
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index c8d923b..bd38b3e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1180,7 +1180,7 @@ struct load_weight {
  * 1) load_avg factors frequency scaling into the amount of time that a
  * sched_entity is runnable on a rq into its weight. For cfs_rq, it is the
  * aggregated such weights of all runnable and blocked sched_entities.
- * 2) util_avg factors frequency scaling into the amount of time
+ * 2) util_avg factors frequency and cpu scaling into the amount of time
  * that a sched_entity is running on a CPU, in the range [0..SCHED_LOAD_SCALE].
  * For cfs_rq, it is the aggregated such times of all runnable and
  * blocked sched_entities.
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 102cdf1..573dc98 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2553,6 +2553,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 	u32 contrib;
 	int delta_w, scaled_delta_w, decayed = 0;
 	unsigned long scale_freq = arch_scale_freq_capacity(NULL, cpu);
+	unsigned long scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
 
 	delta = now - sa->last_update_time;
 	/*
@@ -2596,7 +2597,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 			}
 		}
 		if (running)
-			sa->util_sum += scaled_delta_w;
+			sa->util_sum += scale(scaled_delta_w, scale_cpu);
 
 		delta -= delta_w;
 
@@ -2620,7 +2621,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 				cfs_rq->runnable_load_sum += weight * contrib;
 		}
 		if (running)
-			sa->util_sum += contrib;
+			sa->util_sum += scale(contrib, scale_cpu);
 	}
 
 	/* Remainder of delta accrued against u_0` */
@@ -2631,7 +2632,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 			cfs_rq->runnable_load_sum += weight * scaled_delta;
 	}
 	if (running)
-		sa->util_sum += scaled_delta;
+		sa->util_sum += scale(scaled_delta, scale_cpu);
 
 	sa->period_contrib += delta;
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index c0726d5..167ab48 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1398,7 +1398,7 @@ unsigned long arch_scale_freq_capacity(struct sched_domain *sd, int cpu)
 static __always_inline
 unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
 {
-	if ((sd->flags & SD_SHARE_CPUCAPACITY) && (sd->span_weight > 1))
+	if (sd && (sd->flags & SD_SHARE_CPUCAPACITY) && (sd->span_weight > 1))
 		return sd->smt_gain / sd->span_weight;
 
 	return SCHED_CAPACITY_SCALE;
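
The last hunk matters because __update_load_avg() now calls
arch_scale_cpu_capacity(NULL, cpu), so the default implementation must
tolerate a NULL sched_domain. A minimal user-space sketch of that guard
follows; the struct, the flag value, and the smt_gain figure used in the
note below are stand-ins chosen for illustration, not the kernel's
definitions.

```c
#include <stddef.h>

#define SKETCH_CAPACITY_SCALE 1024UL
#define SKETCH_SD_SHARE_CPUCAPACITY 0x1	/* hypothetical flag value */

/* Hypothetical stand-in for the sched_domain fields used here. */
struct sketch_domain {
	int flags;
	unsigned int span_weight;
	unsigned long smt_gain;
};

/* Without a domain, or outside an SMT domain, report full capacity;
 * inside an SMT domain, share smt_gain among the sibling threads. */
static unsigned long sketch_cpu_capacity(const struct sketch_domain *sd)
{
	if (sd && (sd->flags & SKETCH_SD_SHARE_CPUCAPACITY) &&
	    sd->span_weight > 1)
		return sd->smt_gain / sd->span_weight;

	return SKETCH_CAPACITY_SCALE;
}
```

Checking sd first means the flags are never read through a NULL pointer.
In this sketch a NULL domain yields 1024, and an SMT domain with two
siblings and an illustrative smt_gain of 1178 yields 589.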