linux-kernel - Re: [PATCH v3 1/2] sched/schedutil: Rework performance estimation

Message-ID: <20231121211725.gaekv6svnqdiq5l4@airbuntu>
Date:   Tue, 21 Nov 2023 21:17:25 +0000
From:   Qais Yousef <qyousef@...alina.io>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
        dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
        mgorman@...e.de, bristot@...hat.com, vschneid@...hat.com,
        rafael@...nel.org, viresh.kumar@...aro.org,
        linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
        lukasz.luba@....com, wyes.karny@....com, beata.michalska@....com
Subject: Re: [PATCH v3 1/2] sched/schedutil: Rework performance estimation

On 11/22/23 08:38, Vincent Guittot wrote:

> > > +unsigned long sugov_effective_cpu_perf(int cpu, unsigned long actual,
> > > +                              unsigned long min,
> > > +                              unsigned long max)
> > > +{
> > > +     struct rq *rq = cpu_rq(cpu);
> > > +
> > > +     if (rt_rq_is_runnable(&rq->rt))
> > > +             return max;
> >
> > I think this breaks old behavior. When uclamp_is_used() the frequency of the RT
> > task is determined by uclamp_min; but you revert this to the old behavior where
> > we always return max, no? You should check for !uclamp_is_used(); otherwise let
> > the rest of the function exec as usual.
> 
> Yes, I made a shortcut assuming that max would be adjusted to the max
> allowed freq for an RT task, whereas it's the min freq that is adjusted by
> uclamp and that should also be adjusted without uclamp. Let me fix
> that in effective_cpu_util() and remove this early return from
> sugov_effective_cpu_perf().

+1
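
For readers following along, the behaviour agreed on above can be modelled in
a few lines. This is a userspace sketch only, not kernel code: rt_min_perf()
and its boolean parameters are hypothetical stand-ins for rt_rq_is_runnable()
and uclamp_is_used(), and it assumes a capacity scale of 1024.

```c
#include <assert.h>
#include <stdbool.h>

#define SCHED_CAPACITY_SCALE 1024UL

/*
 * Hypothetical model of the minimum performance requested on behalf of RT:
 * - no runnable RT task: RT contributes nothing;
 * - uclamp not in use:   RT asks for the maximum, as in the old behaviour;
 * - uclamp in use:       the request is whatever uclamp_min says.
 */
static unsigned long rt_min_perf(bool rt_runnable, bool uclamp_used,
				 unsigned long uclamp_min)
{
	assert(uclamp_min <= SCHED_CAPACITY_SCALE);

	if (!rt_runnable)
		return 0;
	if (!uclamp_used)
		return SCHED_CAPACITY_SCALE;
	return uclamp_min;
}
```

The point of the model is that the RT request is expressed through the
minimum, so an unconditional "return max" would ignore uclamp_min whenever
uclamp is in use.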

> > Can we rename this function please? It is not mapping anything, but applying
> > a dvfs headroom (I suggest apply_dvfs_headroom()). Which would make the comment
> > also unnecessary ;-)
> 
> I didn't want to add unnecessary renaming which often confuses
> reviewers so I kept the current function name. But this can then be
> renamed in a follow-up patch

Okay.

> > >  static void sugov_get_util(struct sugov_cpu *sg_cpu)
> > >  {
> > > -     unsigned long util = cpu_util_cfs_boost(sg_cpu->cpu);
> > > -     struct rq *rq = cpu_rq(sg_cpu->cpu);
> > > +     unsigned long min, max, util = cpu_util_cfs_boost(sg_cpu->cpu);
> > >
> > > -     sg_cpu->bw_dl = cpu_bw_dl(rq);
> > > -     sg_cpu->util = effective_cpu_util(sg_cpu->cpu, util,
> > > -                                       FREQUENCY_UTIL, NULL);
> > > +     util = effective_cpu_util(sg_cpu->cpu, util, &min, &max);
> > > +     sg_cpu->bw_min = map_util_perf(min);
> >
> > Hmm. I don't think we need to apply_dvfs_headroom() to min here. What's the
> > rationale to give headroom for min perf requirement? I think the headroom is
> > only required for actual util.
> 
> This headroom only applies to bw_min, which is used with
> cpufreq_driver_adjust_perf(). Currently it only takes cpu_bw_dl()

It is also used in ignore_dl_rate_limit(), which is the user that caught my
eye more.

I have to admit, I always get caught out by the new adjust_perf stuff. The
downside of working on older LTS kernels for a prolonged time :p

> which seems too low because IRQ can preempt DL. So I added the average
> irq utilization into bw_min which is only an estimate and needs some
> headroom. That being said I can probably stay with current behavior
> for now and remove headroom

I think this is more logical IMHO. DL should never need any headroom. And irq
needing headroom is questionable every time I think about it. Does an irq storm
need a dvfs headroom? I don't think it's a clear-cut answer, but I tend towards
no.
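
To put numbers on the headroom question, here is a small userspace sketch, not
kernel code. It assumes the headroom is the usual +25% (util + util/4), which
is what map_util_perf() computed at the time as far as I know; apply_headroom()
and bw_min_estimate() are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define SCHED_CAPACITY_SCALE 1024UL

/* Assumed headroom model: add 25% on top of the utilization. */
static unsigned long apply_headroom(unsigned long util)
{
	return util + (util >> 2);
}

/*
 * Hypothetical helper: minimum performance derived from the average irq
 * utilization plus the DL bandwidth, optionally inflated by the headroom.
 */
static unsigned long bw_min_estimate(unsigned long irq, unsigned long bw_dl,
				     bool with_headroom)
{
	unsigned long min = irq + bw_dl;

	assert(min <= SCHED_CAPACITY_SCALE);

	return with_headroom ? apply_headroom(min) : min;
}
```

With a DL bandwidth of 400 and no irq pressure, the minimum is 400 without
headroom and 500 with it, even though the DL bandwidth is already a computed
requirement rather than an estimate.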

> > And is it right to mix irq and uclamp_min with bw_min which is for DL? We might
> 
> cpu_bw_dl() is not the actual utilization by DL task but the computed
> bandwidth which can be seen as min performance level

Yep. That's why I am not in favour of a dvfs headroom for DL.

But what I meant here is that in effective_cpu_util(), where we populate min
and max, we have:

	if (min) {
	        /*
	         * The minimum utilization returns the highest level between:
	         * - the computed DL bandwidth needed with the irq pressure which
	         *   steals time to the deadline task.
	         * - The minimum performance requirement for CFS and/or RT.
	         */
	        *min = max(irq + cpu_bw_dl(rq), uclamp_rq_get(rq, UCLAMP_MIN));

So if there was an RT/CFS task requesting a UCLAMP_MIN of 1024 for example,
bw_min will end up being too high, no?

Should we add another arg to sugov_effective_cpu_perf() to populate bw_min too
for the single user who wants it?
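
A rough userspace sketch of the concern and of one possible shape for the
extra argument follows. It is an illustration under assumptions, not a
proposed patch: model_min_perf() and its bw_min out-parameter are hypothetical,
and the real functions take different arguments.

```c
#include <assert.h>
#include <stddef.h>

#define SCHED_CAPACITY_SCALE 1024UL

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/*
 * Hypothetical model of the min computation quoted above, with an extra
 * optional out-parameter:
 * - *min    mixes irq + DL bandwidth with uclamp_min, as in the quoted code;
 * - *bw_min keeps only irq + DL bandwidth, for the one user that wants it.
 */
static void model_min_perf(unsigned long irq, unsigned long bw_dl,
			   unsigned long uclamp_min,
			   unsigned long *min, unsigned long *bw_min)
{
	unsigned long dl_floor = irq + bw_dl;

	assert(uclamp_min <= SCHED_CAPACITY_SCALE);

	if (min)
		*min = max_ul(dl_floor, uclamp_min);
	if (bw_min)
		*bw_min = dl_floor;
}
```

With irq = 20, a DL bandwidth of 100 and a UCLAMP_MIN of 1024, the mixed
minimum comes out as 1024 while the DL-only value stays at 120, which is the
gap described above.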


Thanks!

--
Qais Yousef