[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <tip-625ed2bf049d5a352c1bcca962d6e133454eaaff@git.kernel.org>
Date: Mon, 15 May 2017 01:55:44 -0700
From: tip-bot for Vincent Guittot <tipbot@...or.com>
To: linux-tip-commits@...r.kernel.org
Cc: mingo@...nel.org, peterz@...radead.org, vincent.guittot@...aro.org,
hpa@...or.com, torvalds@...ux-foundation.org, efault@....de,
tglx@...utronix.de, linux-kernel@...r.kernel.org
Subject: [tip:sched/core] sched/cfs: Make util/load_avg more stable
Commit-ID: 625ed2bf049d5a352c1bcca962d6e133454eaaff
Gitweb: http://git.kernel.org/tip/625ed2bf049d5a352c1bcca962d6e133454eaaff
Author: Vincent Guittot <vincent.guittot@...aro.org>
AuthorDate: Wed, 26 Apr 2017 08:27:56 +0200
Committer: Ingo Molnar <mingo@...nel.org>
CommitDate: Mon, 15 May 2017 10:15:13 +0200
sched/cfs: Make util/load_avg more stable
In the current implementation of load/util_avg, we assume that the
ongoing time segment has fully elapsed, and util/load_sum is divided
by LOAD_AVG_MAX, even if part of the time segment still remains to
run. As a consequence, this remaining part is considered as idle time
and generates unexpected variations of util_avg of a busy CPU in the
range [1002..1024[ whereas util_avg should stay at 1023.
In order to keep the metric stable, we should not consider the ongoing
time segment when computing load/util_avg but only the segments that
have already fully elapsed. But to not consider the current time
segment adds unwanted latency in the load/util_avg responsivness
especially when the time is scaled instead of the contribution.
Instead of waiting for the current time segment to have fully elapsed
before accounting it in load/util_avg, we can already account the
elapsed part but change the range used to compute load/util_avg
accordingly.
At the very beginning of a new time segment, the past segments have
been decayed and the max value is LOAD_AVG_MAX*y. At the very end of
the current time segment, the max value becomes:
LOAD_AVG_MAX*y + 1024(us) (== LOAD_AVG_MAX)
In fact, the max value is:
LOAD_AVG_MAX*y + sa->period_contrib
at any time in the time segment.
Taking advantage of the fact that:
LOAD_AVG_MAX*y == LOAD_AVG_MAX-1024
the range becomes [0..LOAD_AVG_MAX-1024+sa->period_contrib].
As the elapsed part is already accounted in load/util_sum, we update
the max value according to the current position in the time segment
instead of removing its contribution.
Suggested-by: Peter Zijlstra <peterz@...radead.org>
Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Mike Galbraith <efault@....de>
Cc: Morten.Rasmussen@....com
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: bsegall@...gle.com
Cc: dietmar.eggemann@....com
Cc: pjt@...gle.com
Cc: yuyang.du@...el.com
Link: http://lkml.kernel.org/r/1493188076-2767-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@...nel.org>
---
kernel/sched/fair.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d711093..4f1825d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2916,12 +2916,12 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
/*
* Step 2: update *_avg.
*/
- sa->load_avg = div_u64(sa->load_sum, LOAD_AVG_MAX);
+ sa->load_avg = div_u64(sa->load_sum, LOAD_AVG_MAX - 1024 + sa->period_contrib);
if (cfs_rq) {
cfs_rq->runnable_load_avg =
- div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX);
+ div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX - 1024 + sa->period_contrib);
}
- sa->util_avg = sa->util_sum / LOAD_AVG_MAX;
+ sa->util_avg = sa->util_sum / (LOAD_AVG_MAX - 1024 + sa->period_contrib);
return 1;
}
Powered by blists - more mailing lists