Message-ID: <1779842.1JHXT67au9@vostro.rjw.lan>
Date: Wed, 31 Aug 2016 03:31:07 +0200
From: "Rafael J. Wysocki" <rjw@...ysocki.net>
To: Steve Muckle <steve.muckle@...aro.org>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
"Rafael J . Wysocki" <rafael@...nel.org>,
linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
Vincent Guittot <vincent.guittot@...aro.org>,
Morten Rasmussen <morten.rasmussen@....com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Juri Lelli <Juri.Lelli@....com>,
Patrick Bellasi <patrick.bellasi@....com>,
Steve Muckle <smuckle@...aro.org>
Subject: Re: [PATCH 2/2] sched: cpufreq: use rt_avg as estimate of required RT CPU capacity
On Friday, August 26, 2016 11:40:48 AM Steve Muckle wrote:
> A policy of going to fmax on any RT activity will be detrimental
> for power on many platforms. Often RT accounts for only a small amount
> of CPU activity so sending the CPU frequency to fmax is overkill. Worse
> still, some platforms may not be able to even complete the CPU frequency
> change before the RT activity has already completed.
>
> Cpufreq governors have not treated RT activity this way in the past so
> it is not part of the expected semantics of the RT scheduling class. The
> DL class offers guarantees about task completion and could be used for
> this purpose.
>
> Modify the schedutil algorithm to instead use rt_avg as an estimate of
> RT utilization of the CPU.
>
> Based on previous work by Vincent Guittot <vincent.guittot@...aro.org>.
If we do it for RT, why not do a similar thing for DL, as in the
original patch from Peter, for example?
> Signed-off-by: Steve Muckle <smuckle@...aro.org>
> ---
> kernel/sched/cpufreq_schedutil.c | 26 +++++++++++++++++---------
> 1 file changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index cb8a77b1ef1b..89094a466250 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -146,13 +146,21 @@ static unsigned int get_next_freq(struct sugov_cpu *sg_cpu, unsigned long util,
>
> static void sugov_get_util(unsigned long *util, unsigned long *max)
> {
> - struct rq *rq = this_rq();
> - unsigned long cfs_max;
> + int cpu = smp_processor_id();
> + struct rq *rq = cpu_rq(cpu);
> + unsigned long max_cap, rt;
> + s64 delta;
>
> - cfs_max = arch_scale_cpu_capacity(NULL, smp_processor_id());
> + max_cap = arch_scale_cpu_capacity(NULL, cpu);
>
> - *util = min(rq->cfs.avg.util_avg, cfs_max);
> - *max = cfs_max;
> + delta = rq_clock(rq) - rq->age_stamp;
> + if (unlikely(delta < 0))
> + delta = 0;
> + rt = div64_u64(rq->rt_avg, sched_avg_period() + delta);
> + rt = (rt * max_cap) >> SCHED_CAPACITY_SHIFT;
These computations are rather heavy, so I wonder if they are avoidable based
on the flags, for example?
Plus, is SCHED_CAPACITY_SHIFT actually defined for all architectures?
One more ugly thing is that rq_clock(rq) is used directly here, whereas we
pass it around as the 'time' argument elsewhere.
> +
> + *util = min(rq->cfs.avg.util_avg + rt, max_cap);
> + *max = max_cap;
> }
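For illustration only, the arithmetic in the hunk above can be reproduced in
user space roughly as follows. This is a sketch, not kernel code: the helper
names rt_util_estimate() and combined_util() are hypothetical, the shift value
of 10 is an assumption, and it also assumes that rt_avg accumulates RT run
time multiplied by a capacity factor on the 0..1024 scale, so that dividing by
the averaging window yields a value on that same scale.

```c
#include <stdint.h>

/* Assumed value; the kernel header is not included here. */
#define SCHED_CAPACITY_SHIFT 10

/*
 * Hypothetical helper mirroring the quoted computation:
 * divide the accumulated rt_avg by the averaging window (period plus the
 * time elapsed since the last aging step), then scale by the CPU capacity.
 */
static unsigned long rt_util_estimate(uint64_t rt_avg, uint64_t avg_period,
				      int64_t delta, unsigned long max_cap)
{
	unsigned long rt;

	/* A clock that appears to run backwards is treated as no elapsed time. */
	if (delta < 0)
		delta = 0;

	rt = (unsigned long)(rt_avg / (avg_period + (uint64_t)delta));
	return (rt * max_cap) >> SCHED_CAPACITY_SHIFT;
}

/* Hypothetical helper: CFS utilization plus the RT estimate, clamped. */
static unsigned long combined_util(unsigned long cfs_util, unsigned long rt,
				   unsigned long max_cap)
{
	unsigned long sum = cfs_util + rt;

	return sum < max_cap ? sum : max_cap;
}
```

With a made-up window of 1000000 ns in which RT tasks ran for 250000 ns at full
frequency scale, rt_avg would be 250000 * 1024 under the assumption above, and
rt_util_estimate() would report 256 out of a max_cap of 1024. Note the 64-bit
division on every call, which is the cost in question.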
Thanks,
Rafael