Message-ID: <20231121211725.gaekv6svnqdiq5l4@airbuntu>
Date: Tue, 21 Nov 2023 21:17:25 +0000
From: Qais Yousef <qyousef@...alina.io>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, bristot@...hat.com, vschneid@...hat.com,
rafael@...nel.org, viresh.kumar@...aro.org,
linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
lukasz.luba@....com, wyes.karny@....com, beata.michalska@....com
Subject: Re: [PATCH v3 1/2] sched/schedutil: Rework performance estimation
On 11/22/23 08:38, Vincent Guittot wrote:
> > > +unsigned long sugov_effective_cpu_perf(int cpu, unsigned long actual,
> > > + unsigned long min,
> > > + unsigned long max)
> > > +{
> > > + struct rq *rq = cpu_rq(cpu);
> > > +
> > > + if (rt_rq_is_runnable(&rq->rt))
> > > + return max;
> >
> > I think this breaks old behavior. When uclamp_is_used() the frequency of the RT
> > task is determined by uclamp_min; but you revert this to the old behavior where
> > we always return max, no? You should check for !uclamp_is_used(); otherwise let
> > the rest of the function exec as usual.
>
> Yes, I made a shortcut assuming that max would be adjusted to the max
> allowed freq for RT task whereas it's the min freq that is adjusted by
> uclamp and that should also be adjusted without uclamp. Let me fix
> that in effective_cpu_util and remove this early return from
> sugov_effective_cpu_perf()
+1
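
For reference, the fix agreed on above could look roughly like the sketch
below. It is a standalone illustration, not the actual patch: rt_aware_min()
and its boolean parameters are hypothetical stand-ins for the uclamp_is_used()
and rt_rq_is_runnable() checks, and scale is assumed to be the CPU's maximum
capacity.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not kernel code: raise the minimum performance
 * request to the full scale only when an RT task is runnable and uclamp
 * is not in use. With uclamp in use, min is left alone so that uclamp_min
 * decides the frequency for RT.
 */
static unsigned long rt_aware_min(unsigned long min, unsigned long scale,
				  bool uclamp_used, bool rt_runnable)
{
	if (!uclamp_used && rt_runnable)
		return min > scale ? min : scale;

	return min;
}
```

With min = 100 and scale = 1024, this returns 1024 only for a runnable RT task
without uclamp; in every other case the caller's min of 100 is preserved.
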
> > Can we rename this function please? It is not mapping anything, but applying
> > a dvfs headroom (I suggest apply_dvfs_headroom()). Which would make the comment
> > also unnecessary ;-)
>
> I didn't want to add unnecessary renaming which often confuses
> reviewers so I kept the current function name. But this can then be
> renamed in a follow-up patch
Okay.
> > > static void sugov_get_util(struct sugov_cpu *sg_cpu)
> > > {
> > > - unsigned long util = cpu_util_cfs_boost(sg_cpu->cpu);
> > > - struct rq *rq = cpu_rq(sg_cpu->cpu);
> > > + unsigned long min, max, util = cpu_util_cfs_boost(sg_cpu->cpu);
> > >
> > > - sg_cpu->bw_dl = cpu_bw_dl(rq);
> > > - sg_cpu->util = effective_cpu_util(sg_cpu->cpu, util,
> > > - FREQUENCY_UTIL, NULL);
> > > + util = effective_cpu_util(sg_cpu->cpu, util, &min, &max);
> > > + sg_cpu->bw_min = map_util_perf(min);
> >
> > Hmm. I don't think we need to apply_dvfs_headroom() to min here. What's the
> > rationale to give headroom for min perf requirement? I think the headroom is
> > only required for actual util.
>
> This headroom only applies for bw_min that is used with
> cpufreq_driver_adjust_perf(). Currently it only takes cpu_bw_dl()
It is also used in ignore_dl_rate_limit() - which is the user that caught my
eye more.

I have to admit, I always get caught out with the new adjust_perf stuff. The
downside of working on older LTS kernels for a prolonged time :p
> which seems too low because IRQ can preempt DL. So I added the average
> irq utilization into bw_min which is only an estimate and needs some
> headroom. That being said I can probably stay with current behavior
> for now and remove headroom
I think this is more logical IMHO. DL should never need any headroom. And irq
needing headroom is questionable every time I think about it. Does an irq storm
need a dvfs headroom? I don't think it's a clear-cut answer, but I tend towards
no.
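
To put a number on the headroom being discussed, here is a minimal sketch. It
assumes the util + util/4 (25%) form that map_util_perf() uses in the kernels
this thread targets; headroom_sketch() is a hypothetical name used only for
illustration.

```c
/*
 * Sketch of a 25% DVFS headroom, assuming the util + (util >> 2) form.
 * Hypothetical standalone function, not the kernel helper itself.
 */
static unsigned long headroom_sketch(unsigned long util)
{
	return util + (util >> 2);
}
```

Under that assumption a DL bandwidth of 512 would turn into a request of 640,
even though the bandwidth is already a computed guarantee and not an estimate.
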
> > And is it right to mix irq and uclamp_min with bw_min which is for DL? We might
>
> cpu_bw_dl() is not the actual utilization by DL task but the computed
> bandwidth which can be seen as min performance level
Yep. That's why I am not in favour of a dvfs headroom for DL.
But what I meant here is that in effective_cpu_util(), where we populate min
and max we have
	if (min) {
		/*
		 * The minimum utilization returns the highest level between:
		 * - the computed DL bandwidth needed with the irq pressure which
		 *   steals time to the deadline task.
		 * - The minimum performance requirement for CFS and/or RT.
		 */
		*min = max(irq + cpu_bw_dl(rq), uclamp_rq_get(rq, UCLAMP_MIN));
So if there was an RT/CFS task requesting a UCLAMP_MIN of 1024 for example,
bw_min will end up being too high, no?
Should we add another arg to sugov_effective_cpu_perf() to populate bw_min too
for the single user who wants it?
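
A rough sketch of what keeping the two floors apart could look like. Everything
here is hypothetical (struct min_sketch, split_min() and its parameters are not
kernel names), and whether irq belongs in bw_min at all is still the open
question above; it is included only to mirror the quoted expression.

```c
/* Hypothetical container for the two floors; not a kernel structure. */
struct min_sketch {
	unsigned long perf_min;	/* floor for the frequency request */
	unsigned long bw_min;	/* floor seen by the DL bandwidth users */
};

/*
 * Compute the performance floor as in the quoted max() expression, but
 * report the DL-related floor separately so that a large uclamp_min
 * cannot inflate it.
 */
static struct min_sketch split_min(unsigned long irq, unsigned long dl_bw,
				   unsigned long uclamp_min)
{
	struct min_sketch m;
	unsigned long dl_floor = irq + dl_bw;

	m.perf_min = dl_floor > uclamp_min ? dl_floor : uclamp_min;
	m.bw_min = dl_floor;

	return m;
}
```

With irq = 20, dl_bw = 100 and a UCLAMP_MIN of 1024, perf_min is 1024 while
bw_min stays at 120, which is the separation the question is asking for.
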
Thanks!
--
Qais Yousef