Message-ID: <CAB8ipk-rDR+06mMWgfzGupm8PK=hgtXn2gsZUGVnXuw2YkkesA@mail.gmail.com>
Date: Mon, 24 Jun 2024 10:27:43 +0800
From: Xuewen Yan <xuewen.yan94@...il.com>
To: Dietmar Eggemann <dietmar.eggemann@....com>
Cc: Xuewen Yan <xuewen.yan@...soc.com>, vincent.guittot@...aro.org, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com, rostedt@...dmis.org,
bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com, vschneid@...hat.com,
vincent.donnefort@....com, qyousef@...alina.io, ke.wang@...soc.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched/fair: Prevent cpu_busy_time from exceeding actual_cpu_capacity
On Fri, Jun 21, 2024 at 6:22 PM Dietmar Eggemann
<dietmar.eggemann@....com> wrote:
>
> On 07/06/2024 12:37, Xuewen Yan wrote:
> > On Fri, Jun 7, 2024 at 6:30 PM Dietmar Eggemann
> > <dietmar.eggemann@....com> wrote:
> >>
> >> On 07/06/2024 10:20, Xuewen Yan wrote:
> >>> Hi Dietmar
> >>>
> >>> On Fri, Jun 7, 2024 at 3:19 PM Dietmar Eggemann
> >>> <dietmar.eggemann@....com> wrote:
> >>>>
> >>>> On 06/06/2024 09:06, Xuewen Yan wrote:
> >>>>> Because the effective_cpu_util() would return a util which
> >>>>> maybe bigger than the actual_cpu_capacity, this could cause
> >>>>> the pd_busy_time calculation errors.
> >>>>
> >>>> Doesn't return effective_cpu_util() either scale or min(scale, util)
> >>>> with scale = arch_scale_cpu_capacity(cpu)? So the util sum over the PD
> >>>> cannot exceed eenv->cpu_cap?
> >>>
> >>> In effective_cpu_util, the scale = arch_scale_cpu_capacity(cpu);
> >>> Although there is the clamp of eenv->pd_cap, but let us consider the
> >>> following simple scenario:
> >>> The pd cpus are 4-7, and the arch_scale_capacity is 1024, and because
> >>> of cpufreq-limit,
> >>
> >> Ah, this is due to:
> >>
> >> find_energy_efficient_cpu()
> >>
> >>   ...
> >>   for (; pd; pd = pd->next)
> >>     ...
> >>     cpu_actual_cap = get_actual_cpu_capacity(cpu)
> >>
> >>     for_each_cpu(cpu, cpus)
> >>       ...
> >>       eenv.pd_cap += cpu_actual_cap
> >>
> >> and:
> >>
> >> get_actual_cpu_capacity()
> >>
> >>   ...
> >>   capacity = arch_scale_cpu_capacity(cpu)
> >>
> >>   capacity -= max(hw_load_avg(cpu_rq(cpu)), cpufreq_get_pressure(cpu))
> >>
> >> which got introduced by f1f8d0a22422 ("sched/cpufreq: Take cpufreq
> >> feedback into account").
> >
> > I don't think it was introduced by f1f8d0a22422, because f1f8d0a22422
> > just replaced the cpu_thermal_cap with get_actual_cpu_capacity(cpu).
> > The eenv.cpu_cap was introduced by 3e8c6c9aac42 ("sched/fair: Remove
> > task_util from effective utilization in feec()").
>
> Yes, you're right. 3e8c6c9aac42 changed it from per-CPU to per-PD
> capping.
>
> In case we want to go back to per-CPU then we should remove the
> eenv->pd_cap capping in eenv_pd_busy_time().
>
> -->8--
>
> @@ -7864,16 +7864,15 @@ static inline void eenv_pd_busy_time(struct energy_env *eenv,
>                                      struct cpumask *pd_cpus,
>                                      struct task_struct *p)
>  {
> -       unsigned long busy_time = 0;
>         int cpu;
>
>         for_each_cpu(cpu, pd_cpus) {
>                 unsigned long util = cpu_util(cpu, p, -1, 0);
>
> -               busy_time += effective_cpu_util(cpu, util, NULL, NULL);
> +               util = effective_cpu_util(cpu, util, NULL, NULL);
> +               util = min(util, eenv->cpu_cap);
> +               eenv->pd_busy_time += util;
>         }
> -
> -       eenv->pd_busy_time = min(eenv->pd_cap, busy_time);
>  }
Okay, once each CPU's util is clamped to eenv->cpu_cap, the
eenv->pd_cap clamp on pd_busy_time is indeed unnecessary.
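
To make the difference between the two clamps concrete, here is a
minimal user-space sketch (not kernel code). The helper names and the
array-based interface are hypothetical; util[] stands in for the values
returned by effective_cpu_util() for each CPU in the PD.

```c
#include <stddef.h>

/* Hypothetical helper: the kernel uses the min() macro instead. */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Per-PD capping: sum all CPU utils first, then clamp once to pd_cap. */
static unsigned long pd_busy_time_per_pd(const unsigned long *util, size_t nr,
					 unsigned long pd_cap)
{
	unsigned long busy_time = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		busy_time += util[i];

	return min_ul(pd_cap, busy_time);
}

/* Per-CPU capping: clamp each CPU's util to cpu_cap before summing. */
static unsigned long pd_busy_time_per_cpu(const unsigned long *util, size_t nr,
					  unsigned long cpu_cap)
{
	unsigned long busy_time = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		busy_time += min_ul(util[i], cpu_cap);

	return busy_time;
}
```

With assumed numbers (4 CPUs, actual capacity 512 each, so pd_cap is
2048) and util[] = {1024, 0, 0, 0}, the per-PD variant reports 1024
while the per-CPU variant reports 512, because a single CPU cannot be
busier than its own capped capacity.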
>
>
>
> I'm wondering whether we would need the:
>
>   if (dst_cpu >= 0)
>     busy_time = min(eenv->pd_cap, busy_time + eenv->task_busy_time);
>
> in compute_energy() anymore since we only get a candidate CPU in feec()
> after checking with util_fits_cpu() if cpu can accommodate p :
I think this condition is still necessary: pd_busy_time is clamped on
its own, but once task_busy_time is added on top of it, the sum may
still exceed actual_cpu_cap if this min() is dropped.
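
A small user-space sketch of that point follows. It is not the kernel
function; energy_busy_time() is a hypothetical stand-in that mirrors
only the dst_cpu branch quoted above.

```c
/*
 * Hypothetical stand-in for the busy-time step in compute_energy():
 * when the task is placed in this PD (dst_cpu >= 0), its busy time is
 * added and the sum is clamped to pd_cap.
 */
static unsigned long energy_busy_time(unsigned long pd_cap,
				      unsigned long pd_busy_time,
				      unsigned long task_busy_time,
				      int dst_cpu)
{
	unsigned long busy_time = pd_busy_time;

	if (dst_cpu >= 0) {
		unsigned long sum = busy_time + task_busy_time;

		/* Same effect as min(pd_cap, sum) in the kernel snippet. */
		busy_time = sum < pd_cap ? sum : pd_cap;
	}

	return busy_time;
}
```

With assumed values pd_cap = 2048, pd_busy_time = 2000 and
task_busy_time = 300, the unclamped sum would be 2300, above pd_cap
even though pd_busy_time alone was within bounds.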
>
> feec()
>
>   ...
>
>   for_each_cpu()
>
>     util = cpu_util(cpu, p, cpu, ...)
>     cpu_cap = capacity_of()
>
>     ...
>
>     fits = util_fits_cpu(util, ..., cpu);
>     if (!fits)
>       continue
>
>     /* check if candidate CPU */