Date: Wed, 26 Apr 2017 08:27:56 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: peterz@...radead.org, mingo@...nel.org, linux-kernel@...r.kernel.org
Cc: dietmar.eggemann@....com, Morten.Rasmussen@....com, yuyang.du@...el.com,
	pjt@...gle.com, bsegall@...gle.com,
	Vincent Guittot <vincent.guittot@...aro.org>
Subject: [PATCH v3] sched/cfs: make util/load_avg more stable

In the current implementation of load/util_avg, we assume that the ongoing
time segment has fully elapsed, and util/load_sum is divided by LOAD_AVG_MAX,
even if part of the time segment still remains to run. As a consequence, this
remaining part is considered as idle time and generates unexpected variations
of util_avg of a busy CPU in the range [1002..1024[ whereas util_avg should
stay at 1023.

In order to keep the metric stable, we should not consider the ongoing time
segment when computing load/util_avg but only the segments that have already
fully elapsed. But ignoring the current time segment adds unwanted latency to
the responsiveness of load/util_avg, especially when the time is scaled
instead of the contribution. Instead of waiting for the current time segment
to have fully elapsed before accounting it in load/util_avg, we can already
account the elapsed part but change the range used to compute load/util_avg
accordingly.

At the very beginning of a new time segment, the past segments have been
decayed and the max value is LOAD_AVG_MAX*y. At the very end of the current
time segment, the max value becomes 1024(us) + LOAD_AVG_MAX*y which is equal
to LOAD_AVG_MAX. In fact, the max value is
sa->period_contrib + LOAD_AVG_MAX*y at any time in the time segment.

Taking advantage of the fact that LOAD_AVG_MAX*y == LOAD_AVG_MAX-1024, the
range becomes [0..LOAD_AVG_MAX-1024+sa->period_contrib].

As the elapsed part is already accounted in load/util_sum, we update the max
value according to the current position in the time segment instead of
removing its contribution.

Suggested-by: Peter Zijlstra <peterz@...radead.org>
Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
---

Changes:
-Correct typo in commit message: s/MAX_LOAD_AVG/LOAD_AVG_MAX/ and square bracket

 kernel/sched/fair.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a903276..3531fa1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2916,12 +2916,12 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 	/*
 	 * Step 2: update *_avg.
 	 */
-	sa->load_avg = div_u64(sa->load_sum, LOAD_AVG_MAX);
+	sa->load_avg = div_u64(sa->load_sum, LOAD_AVG_MAX - 1024 + sa->period_contrib);
 	if (cfs_rq) {
 		cfs_rq->runnable_load_avg =
-			div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX);
+			div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX - 1024 + sa->period_contrib);
 	}
-	sa->util_avg = sa->util_sum / LOAD_AVG_MAX;
+	sa->util_avg = sa->util_sum / (LOAD_AVG_MAX - 1024 + sa->period_contrib);
 
 	return 1;
 }
-- 
2.7.4
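
The effect of the divisor change can be checked with a small user-space sketch. It is not kernel code. It assumes LOAD_AVG_MAX is 47742 (the value defined in kernel/sched/fair.c at the time) and that a fully busy CPU contributes 1024 per microsecond to util_sum. The names PERIOD_US, UTIL_SCALE, saturated_util_sum(), util_avg_old() and util_avg_new() are made up for this illustration, and the model ignores the rounding done by the kernel's decay code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of LOAD_AVG_MAX from kernel/sched/fair.c at the time. */
#define LOAD_AVG_MAX 47742u
/* Hypothetical names: segment length in us and the utilization scale. */
#define PERIOD_US    1024u
#define UTIL_SCALE   1024u

/*
 * Idealized util_sum of an always-running CPU. Per the commit message,
 * the max value at any point in the segment is
 * period_contrib + LOAD_AVG_MAX*y, with LOAD_AVG_MAX*y taken as
 * LOAD_AVG_MAX - 1024.
 */
static uint64_t saturated_util_sum(uint32_t period_contrib)
{
	assert(period_contrib < PERIOD_US);
	return (uint64_t)(LOAD_AVG_MAX - PERIOD_US + period_contrib) * UTIL_SCALE;
}

/* Divisor before the patch: always the full LOAD_AVG_MAX. */
static uint32_t util_avg_old(uint64_t util_sum)
{
	return (uint32_t)(util_sum / LOAD_AVG_MAX);
}

/* Divisor after the patch: follows the position in the segment. */
static uint32_t util_avg_new(uint64_t util_sum, uint32_t period_contrib)
{
	return (uint32_t)(util_sum / (LOAD_AVG_MAX - PERIOD_US + period_contrib));
}
```

In this model, util_avg_old(saturated_util_sum(0)) gives 1002 and util_avg_old(saturated_util_sum(1023)) gives 1023, which matches the [1002..1024[ variation described in the commit message. util_avg_new() returns the same value for every period_contrib. That value is 1024 here only because the model is idealized; the commit message expects 1023 on a real kernel.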