Message-ID: <CAKfTPtB1X8TLiTrMsNebtMsZVDPZsePUK-Bs2Zn0cxaZnyBFbQ@mail.gmail.com>
Date: Fri, 4 Sep 2015 09:52:47 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Dietmar Eggemann <dietmar.eggemann@....com>
Cc: Morten Rasmussen <morten.rasmussen@....com>,
"peterz@...radead.org" <peterz@...radead.org>,
"mingo@...hat.com" <mingo@...hat.com>,
"daniel.lezcano@...aro.org" <daniel.lezcano@...aro.org>,
"yuyang.du@...el.com" <yuyang.du@...el.com>,
"mturquette@...libre.com" <mturquette@...libre.com>,
"rjw@...ysocki.net" <rjw@...ysocki.net>,
Juri Lelli <Juri.Lelli@....com>,
"sgurrappadi@...dia.com" <sgurrappadi@...dia.com>,
"pang.xunlei@....com.cn" <pang.xunlei@....com.cn>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 3/6] sched/fair: Make utilization tracking cpu scale-invariant
On 15 August 2015 at 01:04, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
> On 14/08/15 17:23, Morten Rasmussen wrote:
>> From: Dietmar Eggemann <dietmar.eggemann@....com>
>
> [...]
>
>> @@ -2596,7 +2597,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
>> }
>> }
>> if (running)
>> - sa->util_sum += scaled_delta_w;
>> + sa->util_sum = scale(scaled_delta_w, scale_cpu);
>
>
> There is a small issue (using = instead of +=) with fatal consequences
> for the utilization signal.
>
> -- >8 --
>
> Subject: [PATCH] sched/fair: Make utilization tracking cpu scale-invariant
>
> Besides the existing frequency scale-invariance correction factor, apply
> cpu scale-invariance correction factor to utilization tracking to
> compensate for any differences in compute capacity. This could be due to
> micro-architectural differences (i.e. instructions per second) between
> cpus in HMP systems (e.g. big.LITTLE), and/or differences in the current
> maximum frequency supported by individual cpus in SMP systems. In the
> existing implementation utilization isn't comparable between cpus as it
> is relative to the capacity of each individual cpu.
>
> Each segment of the sched_avg.util_sum geometric series is now scaled
> by the cpu performance factor too so the sched_avg.util_avg of each
> sched entity will be independent of the particular cpu of the HMP/SMP
> system on which the sched entity is scheduled.
>
> With this patch, the utilization of a cpu stays relative to the max cpu
> performance of the fastest cpu in the system.
>
> In contrast to utilization (sched_avg.util_sum), load
> (sched_avg.load_sum) should not be scaled by compute capacity. The
> utilization metric is based on running time, which only makes sense when
> cpus are _not_ fully utilized (utilization cannot go beyond 100% even if
> more tasks are added), whereas load is runnable time, which isn't limited
> by the capacity of the cpu and is therefore a better metric for
> overloaded scenarios. If we run two nice-0 busy loops on two cpus with
> different compute capacity, their load should be similar since their
> compute demands are the same. We have to assume that the compute demand
> of any task running on a fully utilized cpu (no spare cycles = 100%
> utilization) is high and the same regardless of the compute capacity of
> its current cpu, hence we shouldn't scale load by cpu capacity.
>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@....com>
> Signed-off-by: Morten Rasmussen <morten.rasmussen@....com>
> ---
> include/linux/sched.h | 2 +-
> kernel/sched/fair.c | 7 ++++---
> kernel/sched/sched.h | 2 +-
> 3 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index a15305117ace..78a93d716fcb 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1180,7 +1180,7 @@ struct load_weight {
> * 1) load_avg factors frequency scaling into the amount of time that a
> * sched_entity is runnable on a rq into its weight. For cfs_rq, it is the
> * aggregated such weights of all runnable and blocked sched_entities.
> - * 2) util_avg factors frequency scaling into the amount of time
> + * 2) util_avg factors frequency and cpu scaling into the amount of time
> * that a sched_entity is running on a CPU, in the range [0..SCHED_LOAD_SCALE].
> * For cfs_rq, it is the aggregated such times of all runnable and
> * blocked sched_entities.
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index c72223a299a8..3321eb13e422 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2553,6 +2553,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
> u32 contrib;
> int delta_w, scaled_delta_w, decayed = 0;
> unsigned long scale_freq = arch_scale_freq_capacity(NULL, cpu);
> + unsigned long scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
>
> delta = now - sa->last_update_time;
> /*
> @@ -2596,7 +2597,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
> }
> }
> if (running)
> - sa->util_sum += scaled_delta_w;
> + sa->util_sum += scale(scaled_delta_w, scale_cpu);
>
> delta -= delta_w;
>
> @@ -2620,7 +2621,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
> cfs_rq->runnable_load_sum += weight * contrib;
> }
> if (running)
> - sa->util_sum += contrib;
> + sa->util_sum += scale(contrib, scale_cpu);
> }
>
> /* Remainder of delta accrued against u_0` */
> @@ -2631,7 +2632,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
> cfs_rq->runnable_load_sum += weight * scaled_delta;
> }
> if (running)
> - sa->util_sum += scaled_delta;
> + sa->util_sum += scale(scaled_delta, scale_cpu);
>
> sa->period_contrib += delta;
>
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 7e6f2506a402..50836a9301f9 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1406,7 +1406,7 @@ unsigned long arch_scale_freq_capacity(struct sched_domain *sd, int cpu)
> static __always_inline
> unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
> {
> - if ((sd->flags & SD_SHARE_CPUCAPACITY) && (sd->span_weight > 1))
> + if (sd && (sd->flags & SD_SHARE_CPUCAPACITY) && (sd->span_weight > 1))
> return sd->smt_gain / sd->span_weight;
>
> return SCHED_CAPACITY_SCALE;
FWIW, you can add my
Acked-by: Vincent Guittot <vincent.guittot@...aro.org>
to this corrected version.
> --
> 1.9.1
>
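
For readers following the `=` versus `+=` fix quoted above, here is a minimal user-space sketch of why the assignment is fatal for the signal. It assumes scale(v, s) is a fixed-point multiply, (v * s) >> SCHED_CAPACITY_SHIFT, with SCHED_CAPACITY_SCALE = 1024. The helpers util_sum_accumulate() and util_sum_overwrite() are hypothetical names used only for illustration; they are not kernel functions, and the sketch ignores decay and frequency scaling.

```c
#include <assert.h>
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1UL << SCHED_CAPACITY_SHIFT)

/* Assumed behaviour of the patch's scale(): multiply v by s / 1024. */
static inline uint32_t scale(uint32_t v, unsigned long s)
{
	return (uint32_t)(((uint64_t)v * s) >> SCHED_CAPACITY_SHIFT);
}

/*
 * Corrected behaviour (hypothetical helper): every running-time segment
 * is scaled by the cpu capacity and added to the running sum.
 */
static uint32_t util_sum_accumulate(const uint32_t *seg, int n,
				    unsigned long scale_cpu)
{
	uint32_t sum = 0;
	int i;

	assert(scale_cpu <= SCHED_CAPACITY_SCALE);
	for (i = 0; i < n; i++)
		sum += scale(seg[i], scale_cpu);

	return sum;
}

/*
 * Buggy behaviour (hypothetical helper): '=' throws away everything
 * accumulated so far, so only the last segment survives.
 */
static uint32_t util_sum_overwrite(const uint32_t *seg, int n,
				   unsigned long scale_cpu)
{
	uint32_t sum = 0;
	int i;

	assert(scale_cpu <= SCHED_CAPACITY_SCALE);
	for (i = 0; i < n; i++)
		sum = scale(seg[i], scale_cpu);

	return sum;
}
```

With three fully running segments of 1024 each on a cpu whose capacity is 512 (half of the biggest cpu), the accumulating variant yields 3 * 512 = 1536, while the overwriting variant is left with 512, i.e. all history is lost each time that code path runs.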