Message-ID: <20180622075853.GC23168@e108498-lin.cambridge.arm.com>
Date: Fri, 22 Jun 2018 08:58:53 +0100
From: Quentin Perret <quentin.perret@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Vincent Guittot <vincent.guittot@...aro.org>, mingo@...nel.org,
linux-kernel@...r.kernel.org, rjw@...ysocki.net,
juri.lelli@...hat.com, dietmar.eggemann@....com,
Morten.Rasmussen@....com, viresh.kumar@...aro.org,
valentin.schneider@....com, patrick.bellasi@....com,
joel@...lfernandes.org, daniel.lezcano@...aro.org,
Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH v6 04/11] cpufreq/schedutil: use rt utilization tracking

Hi Peter,

On Thursday 21 Jun 2018 at 20:45:24 (+0200), Peter Zijlstra wrote:
> On Fri, Jun 08, 2018 at 02:09:47PM +0200, Vincent Guittot wrote:
> >  static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
> >  {
> >      struct rq *rq = cpu_rq(sg_cpu->cpu);
> > +    unsigned long util;
> >
> >      if (rq->rt.rt_nr_running)
> >          return sg_cpu->max;
> >
> > +    util = sg_cpu->util_dl;
> > +    util += sg_cpu->util_cfs;
> > +    util += sg_cpu->util_rt;
> > +
> >      /*
> >       * Utilization required by DEADLINE must always be granted while, for
> >       * FAIR, we use blocked utilization of IDLE CPUs as a mechanism to
> > @@ -197,7 +204,7 @@ static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
> >       * util_cfs + util_dl as requested freq. However, cpufreq is not yet
> >       * ready for such an interface. So, we only do the latter for now.
> >       */
> > -    return min(sg_cpu->max, (sg_cpu->util_dl + sg_cpu->util_cfs));
> > +    return min(sg_cpu->max, util);
> >  }
>
> So this (and the dl etc. equivalents) result in exactly the problems
> complained about last time, no?
>
> What I proposed was something along the lines of:
>
> util = 1024 * sg_cpu->util_cfs;
> util /= (1024 - (sg_cpu->util_rt + sg_cpu->util_dl + ...));
>
> return min(sg_cpu->max, util + sg_cpu->bw_dl);
>
> Where we, instead of directly adding the various util signals.
>
> I now see an email from Quentin asking if these things are not in fact
> the same, but no, they are not. The difference is that the above only
> affects the CFS signal and will re-normalize the utilization of an
> 'always' running task back to 1 by compensating for the stolen capacity.
>
> But it will not, like these here patches, affect the OPP selection of
> other classes. If there is no CFS utilization (or very little), then the
> renormalization will not matter, and the existing DL bandwidth
> computation will be unaffected.

Right, thinking more carefully about this re-scaling, the two things are
indeed not the same, but I'm still not sure this is what we want.

Say RT steals 50% of the capacity and a 25% CFS task is running. If we
re-scale, we end up with a 50% request for CFS (util == 512 with your
code above). But if we want to see a little bit of idle time in the
system, we should really request an OPP for 75%+ of capacity, no? Or am
I missing something?
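
To make the arithmetic concrete, here is a minimal user-space sketch (not
kernel code) comparing the two aggregations on the example above. The helper
names rescale_cfs() and sum_classes() are made up for illustration, the
DL bandwidth term is left out, and utilization is assumed to be on the usual
0..1024 capacity scale:

```c
#define SCHED_CAPACITY_SCALE 1024UL

/* Clamp a request to the full capacity of the CPU. */
static unsigned long clamp_to_max(unsigned long util)
{
    return util < SCHED_CAPACITY_SCALE ? util : SCHED_CAPACITY_SCALE;
}

/*
 * Hypothetical helper: re-scale the CFS signal by the capacity left over
 * once the other classes (util_other) have taken their share, as in the
 * "1024 * util_cfs / (1024 - util_other)" proposal quoted above.
 */
static unsigned long rescale_cfs(unsigned long util_cfs,
                                 unsigned long util_other)
{
    /* Nothing left for CFS: avoid dividing by zero, ask for max. */
    if (util_other >= SCHED_CAPACITY_SCALE)
        return SCHED_CAPACITY_SCALE;

    return clamp_to_max(SCHED_CAPACITY_SCALE * util_cfs /
                        (SCHED_CAPACITY_SCALE - util_other));
}

/* Hypothetical helper: plain sum of the class signals, as in the patch. */
static unsigned long sum_classes(unsigned long util_cfs,
                                 unsigned long util_other)
{
    return clamp_to_max(util_cfs + util_other);
}
```

With util_other = 512 (50% stolen by RT) and util_cfs = 256 (a 25% task),
rescale_cfs() gives 1024 * 256 / 512 = 512, whereas sum_classes() gives 768.
For an always-running CFS task squeezed down to util_cfs = 512 by the same RT
load, rescale_cfs() gives 1024, which is the re-normalization Peter describes.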

Also, I think Juri had concerns about using util_dl (as a PELT signal)
for OPP selection, since that kills the benefit of DL for long-running
DL tasks. Or can we assume that DL tasks with very long runtimes/periods
are a corner case we can ignore?

Thanks,
Quentin