Date: Thu, 27 Jun 2024 10:02:24 +0800
From: Xuewen Yan <xuewen.yan94@...il.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Xuewen Yan <xuewen.yan@...soc.com>, dietmar.eggemann@....com, mingo@...hat.com, 
	peterz@...radead.org, juri.lelli@...hat.com, qyousef@...alina.io, 
	rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com, 
	vschneid@...hat.com, christian.loehle@....com, vincent.donnefort@....com, 
	ke.wang@...soc.com, di.shen@...soc.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH V2 1/2] sched/fair: Prevent cpu_busy_time from exceeding actual_cpu_capacity

On Tue, Jun 25, 2024 at 9:05 PM Vincent Guittot
<vincent.guittot@...aro.org> wrote:
>
> On Mon, 24 Jun 2024 at 10:22, Xuewen Yan <xuewen.yan@...soc.com> wrote:
> >
> > Commit 3e8c6c9aac42 ("sched/fair: Remove task_util from effective utilization in feec()")
> > changed the PD's util from per-CPU to per-PD capping. But because
> > the effective_cpu_util() would return a util which maybe bigger
> > than the actual_cpu_capacity, this could cause the pd_busy_time
> > calculation errors.
>
> I'm still not convinced that this is an error. Your example used for v1 is :
>
> The pd cpus are 4-7, and the arch_scale_capacity is 1024, and because
> of cpufreq-limit, the cpu_actual_cap = 512.
>
> Then the eenv->cpu_cap = 512, the eenv->pd_cap = 2048;
> effective_cpu_util(4) = 1024;
> effective_cpu_util(5) = 1024;
> effective_cpu_util(6) = 256;
> effective_cpu_util(7) = 0;
>
> so env->pd_busy_time = 2304
>
> Even if effective_cpu_util(4) = 1024; is above the current max compute
> capacity of 512, this also means that activity of cpu4 will run twice
> longer . If you cap effective_cpu_util(4) to 512 you miss the
> information that it will run twice longer at the selected OPP. The
> extreme case being:
> effective_cpu_util(4) = 1024;
> effective_cpu_util(5) = 1024;
> effective_cpu_util(6) = 1024;
> effective_cpu_util(7) = 1024;
>
> in this case env->pd_busy_time = 4096
>
> If we cap, we can't make any difference between the 2 cases
>
> Do you have more details about the problem you are facing ?

Because of the cpufreq limit, the OPP is limited as well, and in compute_energy():

energy = ps->cost * sum_util = ps->cost * eenv->pd_busy_time;

Because of the cpufreq limit, ps->cost is the cost of the OPP at the limited
frequency, not the cost at the max frequency.
So for that OPP, the energy is determined by pd_busy_time.

Taking the same example as above:

The pd cpus are 4-7, and the arch_scale_capacity is 1024, and because
of cpufreq-limit, the cpu_actual_cap = 512.

Then the eenv->cpu_cap = 512, the eenv->pd_cap = 2048;
effective_cpu_util(4) = 1024;
effective_cpu_util(5) = 1024;
effective_cpu_util(6) = 256;
effective_cpu_util(7) = 0;

Before the patch:
eenv->pd_busy_time = min(1024+1024+256, eenv->pd_cap) = 2048.
However, because effective_cpu_util(7) = 0, this 2048 is bigger than what
the CPUs can actually be busy for at actual_cpu_cap (at most 512 per CPU).

After the patch:
cpu_util(4) = min(1024, eenv->cpu_cap) = 512;
cpu_util(5) = min(1024, eenv->cpu_cap) = 512;
cpu_util(6) = min(256, eenv->cpu_cap) = 256;
cpu_util(7) = 0;
eenv->pd_busy_time = min(512+512+256, eenv->pd_cap) = 1280.

As a result, without this patch, the estimated energy is bigger than the actual energy.
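
To make the arithmetic above easy to check, here is a minimal userspace sketch of the two capping schemes. It is not the kernel code: min_ul(), pd_busy_time_pd_capped(), pd_busy_time_cpu_capped() and example_util[] are hypothetical names used only for illustration, and the array simply holds the effective_cpu_util() values from the example.

```c
#include <stddef.h>

/* effective_cpu_util() values for cpus 4-7, taken from the example above. */
static const unsigned long example_util[4] = { 1024, 1024, 256, 0 };

/* Smaller of two values; a stand-in for the kernel's min() macro. */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Before the patch: sum the raw utilization, then cap the sum at pd_cap. */
static unsigned long pd_busy_time_pd_capped(const unsigned long *util,
					    size_t nr_cpus,
					    unsigned long pd_cap)
{
	unsigned long busy_time = 0;

	for (size_t i = 0; i < nr_cpus; i++)
		busy_time += util[i];

	return min_ul(pd_cap, busy_time);
}

/* After the patch: cap each CPU's utilization at cpu_cap, then sum. */
static unsigned long pd_busy_time_cpu_capped(const unsigned long *util,
					     size_t nr_cpus,
					     unsigned long cpu_cap)
{
	unsigned long busy_time = 0;

	for (size_t i = 0; i < nr_cpus; i++)
		busy_time += min_ul(cpu_cap, util[i]);

	return busy_time;
}
```

With pd_cap = 2048 and cpu_cap = 512, the first helper gives 2048 for example_util and the second gives 1280. With the same ps->cost, the estimate before the patch is therefore 2048 / 1280 = 1.6 times the estimate after it.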

And even if cpu4 runs twice as long, the energy may not be equal,
because:

 *             ps->power * cpu_max_freq
 *   cpu_nrg = ------------------------ * cpu_util           (3)
 *               ps->freq * scale_cpu

With ps->power = C * f * V^2 (f being ps->freq), this becomes:

 *             C * V^2 * cpu_max_freq
 *   cpu_nrg = ---------------------- * cpu_util
 *                   scale_cpu

and the voltage at the limited frequency is not equal to the voltage at
the max frequency.
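
As a rough illustration of the voltage dependence, the simplified formula can be written as a small floating-point helper. This is only a sketch: cpu_nrg_sketch() is a hypothetical name, the capacitance term is treated as an arbitrary constant, and any voltages passed in are made-up inputs, not values from a real OPP table.

```c
/*
 * Simplified energy estimate, assuming ps->power = C * f * V^2:
 *
 *   cpu_nrg = C * V^2 * cpu_max_freq / scale_cpu * cpu_util
 *
 * All parameters are illustrative inputs; units are arbitrary.
 */
static double cpu_nrg_sketch(double c, double volt, double cpu_max_freq,
			     double scale_cpu, double cpu_util)
{
	return c * volt * volt * cpu_max_freq / scale_cpu * cpu_util;
}
```

For the same cpu_util, the ratio between the estimates at two OPPs is (V_limited / V_max)^2. For example, with hypothetical voltages of 0.8 and 1.0 the limited OPP costs about 0.64 of the max-OPP estimate, so the two cases are not interchangeable even when the busy time doubles.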

>
>
>
> > So clamp the cpu_busy_time with the eenv->cpu_cap, which is
> > the actual_cpu_capacity.
> >
> > Fixes: 3e8c6c9aac42 ("sched/fair: Remove task_util from effective utilization in feec()")
> > Signed-off-by: Xuewen Yan <xuewen.yan@...soc.com>
> > Tested-by: Christian Loehle <christian.loehle@....com>
> > ---
> > V2:
> > - change commit message.
> > - remove the eenv->pd_cap capping in eenv_pd_busy_time(). (Dietmar)
> > - add Tested-by.
> > ---
> >  kernel/sched/fair.c | 9 +++++----
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 8a5b1ae0aa55..5ca6396ef0b7 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7864,16 +7864,17 @@ static inline void eenv_pd_busy_time(struct energy_env *eenv,
> >                                      struct cpumask *pd_cpus,
> >                                      struct task_struct *p)
> >  {
> > -       unsigned long busy_time = 0;
> >         int cpu;
> >
> > +       eenv->pd_busy_time = 0;
> > +
> >         for_each_cpu(cpu, pd_cpus) {
> >                 unsigned long util = cpu_util(cpu, p, -1, 0);
> >
> > -               busy_time += effective_cpu_util(cpu, util, NULL, NULL);
> > +               util = effective_cpu_util(cpu, util, NULL, NULL);
> > +               util = min(eenv->cpu_cap, util);
> > +               eenv->pd_busy_time += util;
> >         }
> > -
> > -       eenv->pd_busy_time = min(eenv->pd_cap, busy_time);
> >  }
> >
> >  /*
> > --
> > 2.25.1
> >
> >
