linux-kernel - Re: [PATCH] sched/fair: Prevent cpu_busy_time from exceeding actual_cpu

Message-ID: <CAKfTPtAgXHDjjPhNhDPZzWbPX-DNJzb5TH9DeF-cYOcEC=4igg@mail.gmail.com>
Date: Tue, 18 Jun 2024 23:05:44 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Qais Yousef <qyousef@...alina.io>
Cc: Xuewen Yan <xuewen.yan94@...il.com>, Xuewen Yan <xuewen.yan@...soc.com>, mingo@...hat.com, 
	peterz@...radead.org, juri.lelli@...hat.com, dietmar.eggemann@....com, 
	rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com, 
	vschneid@...hat.com, vincent.donnefort@....com, ke.wang@...soc.com, 
	linux-kernel@...r.kernel.org, christian.loehle@....com
Subject: Re: [PATCH] sched/fair: Prevent cpu_busy_time from exceeding actual_cpu_capacity

On Tue, 18 Jun 2024 at 17:39, Qais Yousef <qyousef@...alina.io> wrote:
>
> On 06/18/24 17:23, Vincent Guittot wrote:
> > On Mon, 17 Jun 2024 at 12:53, Qais Yousef <qyousef@...alina.io> wrote:
> > >
> > > On 06/17/24 11:07, Vincent Guittot wrote:
> > >
> > > > > And should effective_cpu_util() return a value higher than
> > > > > get_actual_cpu_capacity()?
> > > >
> > > > I don't think we should because we want to return the effective
> > > > utilization not the actual compute capacity.
> > > > > Having a utilization of the cpu or group of cpus above the actual
> > > > > capacity or the original capacity mainly means that we will have to
> > > > > run longer.
> > > >
> > > > By capping the utilization we filter this information.
> > > >
> > > > capacity orig = 800
> > > > util_avg = 700
> > > >
> > > > > if we cap the capacity to 400 the cpu is expected to run twice as long
> > > > > for the same amount of work to be done
> > >
> > > Okay makes sense. Wouldn't the util be 'wrong' (to what degree will depend on
> > > min/max freq ratio) though?
> > >
> > > We cap with arch_scale_capacity() still, I guess we know at this stage it is
> > > 100% wrong if we allow returning higher values?
> >
> > I think that capping utilization to max capacity generates some energy
> > estimation error because it filters the fact that we run longer in
> > some cases.
>
> Yes, I think so too and that was my first statement. But I think this is
> a bigger change to do separately.
>
> I *think* we have another source of error, we take util/cpu_cap as a percentage
> of time the CPU is busy. We assume an implicit multiplication with a time
> period, T. I am not sure if this implicit assumption is accurate and things are
> aligned properly. Especially with how utilization loses the temporal info due
> to invariance. util can be low but actual runtime will be much longer. I'm not

I'm not sure I get what you mean by "how utilization loses the
temporal info due to invariance".

Utilization aims to estimate the number of instructions to execute,
regardless of which CPU in the system runs them. Dividing it by the
compute capacity of a CPU's OPP estimates how long the job will take.
So if the capacity of an OPP of a CPU is low, the result reflects that
the actual runtime will be much longer. A low utilization means that
you don't have many instructions to execute; it says nothing about the
speed at which you will execute them.
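
The relationship described above can be illustrated with a minimal
sketch. This is not kernel code: the function name
`estimated_busy_ratio` is hypothetical, and the values are the ones
used earlier in this thread (util_avg = 700, capacity orig = 800,
capped capacity = 400).

```python
def estimated_busy_ratio(util, opp_capacity):
    """Estimate how busy a CPU is: work to do divided by compute capacity.

    Hypothetical helper for illustration only. A result above 1.0 means
    the work does not fit in one period and the CPU has to run longer.
    """
    if opp_capacity <= 0:
        raise ValueError("opp_capacity must be positive")
    return util / opp_capacity


util_avg = 700  # same amount of work in both cases

at_orig = estimated_busy_ratio(util_avg, 800)    # 0.875
at_capped = estimated_busy_ratio(util_avg, 400)  # 1.75

# Halving the capacity doubles the estimated runtime for the same work.
print(at_orig, at_capped, at_capped / at_orig)
```

The utilization is identical in both calls; only the capacity it is
divided by changes the estimated runtime.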

Problems start when, for example, we cap utilization to the CPU
capacity, because we then also cap this temporal info.
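
A small sketch of why capping hides that information, using a capacity
of 400 as in the example from this thread. The helper names are
hypothetical and this is not how the kernel code is written; it only
shows the arithmetic.

```python
def busy_ratio(util, capacity):
    # Uncapped: may exceed 1.0, meaning the CPU must run longer.
    return util / capacity


def capped_busy_ratio(util, capacity):
    # Capped: utilization is clamped to the capacity first, so the
    # result saturates at 1.0.
    return min(util, capacity) / capacity


capacity = 400
for util in (500, 700):
    print(util, busy_ratio(util, capacity), capped_busy_ratio(util, capacity))
```

Once capped, both workloads report the same ratio even though their
uncapped ratios differ, so the information about how much longer each
one has to run is gone.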

> sure if this implicit multiplication is handling this properly. Besides, due to
> performance domains having shared CPUs, I am not sure this period is aligned
> across all CPUs for this implicit multiplication to work as intended.

It's all about averages, because it's too expensive, if even possible,
to know when the instructions will be executed on the other CPUs. We
can only take the edge case (currently the worst case).

Besides that, there is the impact of uclamp, which makes the selected
OPP not always sustainable but sometimes only temporary.

>
> I have yet to study this properly. But I thought I'd mention it as I think this
> (energy estimation) is increasingly becoming an important area to improve on.