Message-ID: <CAB8ipk-yAoX5EJ975ZVKfgZP7rP-vzuc3bLVr6yiLtMv26Lxjw@mail.gmail.com>
Date: Thu, 27 Jun 2024 10:02:24 +0800
From: Xuewen Yan <xuewen.yan94@...il.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Xuewen Yan <xuewen.yan@...soc.com>, dietmar.eggemann@....com, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com, qyousef@...alina.io,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com,
vschneid@...hat.com, christian.loehle@....com, vincent.donnefort@....com,
ke.wang@...soc.com, di.shen@...soc.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH V2 1/2] sched/fair: Prevent cpu_busy_time from exceeding actual_cpu_capacity

On Tue, Jun 25, 2024 at 9:05 PM Vincent Guittot <vincent.guittot@...aro.org> wrote:
>
> On Mon, 24 Jun 2024 at 10:22, Xuewen Yan <xuewen.yan@...soc.com> wrote:
> >
> > Commit 3e8c6c9aac42 ("sched/fair: Remove task_util from effective utilization in feec()")
> > changed the PD's util from per-CPU to per-PD capping. But because
> > the effective_cpu_util() would return a util which maybe bigger
> > than the actual_cpu_capacity, this could cause the pd_busy_time
> > calculation errors.
>
> I'm still not convinced that this is an error. Your example used for v1 is :
>
> The pd cpus are 4-7, and the arch_scale_capacity is 1024, and because
> of cpufreq-limit, the cpu_actual_cap = 512.
>
> Then the eenv->cpu_cap = 512, the eenv->pd_cap = 2048;
> effective_cpu_util(4) = 1024;
> effective_cpu_util(5) = 1024;
> effective_cpu_util(6) = 256;
> effective_cpu_util(7) = 0;
>
> so env->pd_busy_time = 2304
>
> Even if effective_cpu_util(4) = 1024; is above the current max compute
> capacity of 512, this also means that activity of cpu4 will run twice
> longer . If you cap effective_cpu_util(4) to 512 you miss the
> information that it will run twice longer at the selected OPP. The
> extreme case being:
> effective_cpu_util(4) = 1024;
> effective_cpu_util(5) = 1024;
> effective_cpu_util(6) = 1024;
> effective_cpu_util(7) = 1024;
>
> in this case env->pd_busy_time = 4096
>
> If we cap, we can't make any difference between the 2 cases
>
> Do you have more details about the problem you are facing ?
Because of the cpufreq limit, the OPP is limited as well, and the
energy computation is:
    energy = ps->cost * sum_util = ps->cost * eenv->pd_busy_time;
With the cpufreq limit in place, ps->cost is the cost of the OPP at the
limited frequency, not the cost of the OPP at the max frequency.
So for that OPP the energy is determined by pd_busy_time.
Taking the same example as above:
The pd cpus are 4-7, arch_scale_capacity is 1024, and because of the
cpufreq limit, cpu_actual_cap = 512.
Then eenv->cpu_cap = 512 and eenv->pd_cap = 2048;
effective_cpu_util(4) = 1024;
effective_cpu_util(5) = 1024;
effective_cpu_util(6) = 256;
effective_cpu_util(7) = 0;
Before the patch:
eenv->pd_busy_time = min(1024+1024+256, eenv->pd_cap) = 2048.
However, because effective_cpu_util(7) = 0, this 2048 is in fact bigger
than the capacity actually available to the busy cpus, each of which is
limited to cpu_cap = 512.
After the patch:
cpu_util(4) = min(1024, eenv->cpu_cap) = 512;
cpu_util(5) = min(1024, eenv->cpu_cap) = 512;
cpu_util(6) = min(256, eenv->cpu_cap) = 256;
cpu_util(7) = 0;
eenv->pd_busy_time = min(512+512+256, eenv->pd_cap) = 1280.
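
To make the arithmetic above easy to check, here is a minimal user-space sketch in C. It is not the kernel code: the helper names (min_ul, pd_busy_time_pd_capped, pd_busy_time_cpu_capped) are made up for illustration, and the per-cpu utilization is passed in as a plain array instead of being read from effective_cpu_util().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space helpers for illustration; not kernel code. */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Before the patch: sum the per-cpu utilization, then cap the sum at pd_cap. */
unsigned long pd_busy_time_pd_capped(const unsigned long *util, size_t nr_cpus,
				     unsigned long pd_cap)
{
	unsigned long busy_time = 0;

	assert(util != NULL || nr_cpus == 0);

	for (size_t i = 0; i < nr_cpus; i++)
		busy_time += util[i];

	return min_ul(pd_cap, busy_time);
}

/* After the patch: cap each cpu at cpu_cap first, then sum. */
unsigned long pd_busy_time_cpu_capped(const unsigned long *util, size_t nr_cpus,
				      unsigned long cpu_cap)
{
	unsigned long busy_time = 0;

	assert(util != NULL || nr_cpus == 0);

	for (size_t i = 0; i < nr_cpus; i++)
		busy_time += min_ul(cpu_cap, util[i]);

	return busy_time;
}
```

With util = {1024, 1024, 256, 0}, pd_cap = 2048 and cpu_cap = 512, the first helper returns 2048 and the second returns 1280, matching the two results above.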
As a result, without this patch, the estimated energy is bigger than the
actual energy.
And even if cpu4 runs twice as long, the energy may not be equal,
because:
 *             ps->power * cpu_max_freq
 *   cpu_nrg = ------------------------ * cpu_util           (3)
 *               ps->freq * scale_cpu
With ps->power = c * f * v^2 (where f is ps->freq), ps->freq cancels out:
 *             c * v^2 * cpu_max_freq
 *   cpu_nrg = ---------------------- * cpu_util
 *                   scale_cpu
so the energy depends on v^2, and the voltage at the limited frequency
is not equal to the voltage at the max frequency.
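
As a rough illustration of the voltage dependence, the two forms of the formula can be written as plain C functions. This is only a sketch under stated assumptions: the function names are hypothetical, the power model ps->power = c * f * v^2 is the one assumed above, and floating point is used for readability even though the kernel uses integer arithmetic.

```c
/*
 * Hypothetical helpers for illustration; not the kernel's energy model.
 * Assumed power model: ps_power = c * ps_freq * volt^2.
 */

/* Formula (3) with ps->power expanded using the assumed power model. */
double cpu_nrg_full(double c, double ps_freq, double volt,
		    double cpu_max_freq, double scale_cpu, double cpu_util)
{
	double ps_power = c * ps_freq * volt * volt;

	return ps_power * cpu_max_freq / (ps_freq * scale_cpu) * cpu_util;
}

/* Same value after ps_freq cancels: only volt^2 distinguishes the OPPs. */
double cpu_nrg_simplified(double c, double volt, double cpu_max_freq,
			  double scale_cpu, double cpu_util)
{
	return c * volt * volt * cpu_max_freq / scale_cpu * cpu_util;
}
```

With made-up values c = 1, cpu_max_freq = 2000, scale_cpu = 1024 and cpu_util = 512, both functions give 250 for v = 0.5 (at any ps_freq, e.g. 1000), while the simplified form gives 1000 for v = 1.0. So for the same cpu_util the result scales with v^2 of the selected OPP.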
>
>
>
> > So clamp the cpu_busy_time with the eenv->cpu_cap, which is
> > the actual_cpu_capacity.
> >
> > Fixes: 3e8c6c9aac42 ("sched/fair: Remove task_util from effective utilization in feec()")
> > Signed-off-by: Xuewen Yan <xuewen.yan@...soc.com>
> > Tested-by: Christian Loehle <christian.loehle@....com>
> > ---
> > V2:
> > - change commit message.
> > - remove the eenv->pd_cap capping in eenv_pd_busy_time(). (Dietmar)
> > - add Tested-by.
> > ---
> > kernel/sched/fair.c | 9 +++++----
> > 1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 8a5b1ae0aa55..5ca6396ef0b7 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7864,16 +7864,17 @@ static inline void eenv_pd_busy_time(struct energy_env *eenv,
> >                                      struct cpumask *pd_cpus,
> >                                      struct task_struct *p)
> >  {
> > -       unsigned long busy_time = 0;
> >         int cpu;
> >
> > +       eenv->pd_busy_time = 0;
> > +
> >         for_each_cpu(cpu, pd_cpus) {
> >                 unsigned long util = cpu_util(cpu, p, -1, 0);
> >
> > -               busy_time += effective_cpu_util(cpu, util, NULL, NULL);
> > +               util = effective_cpu_util(cpu, util, NULL, NULL);
> > +               util = min(eenv->cpu_cap, util);
> > +               eenv->pd_busy_time += util;
> >         }
> > -
> > -       eenv->pd_busy_time = min(eenv->pd_cap, busy_time);
> >  }
> >
> >  /*
> > --
> > 2.25.1
> >
> >