[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9e935645-9baf-af9f-73bd-3eaeaec044a8@arm.com>
Date: Thu, 9 Feb 2023 19:02:12 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Vincent Guittot <vincent.guittot@...aro.org>,
Qais Yousef <qyousef@...alina.io>
Cc: Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
linux-kernel@...r.kernel.org, Lukasz Luba <lukasz.luba@....com>,
Wei Wang <wvw@...gle.com>, Xuewen Yan <xuewen.yan94@...il.com>,
Hank <han.lin@...iatek.com>,
Jonathan JMChen <Jonathan.JMChen@...iatek.com>
Subject: Re: [PATCH v2 1/3] sched/uclamp: Set max_spare_cap_cpu even if
max_spare_cap is 0
On 07/02/2023 10:45, Vincent Guittot wrote:
> On Sun, 5 Feb 2023 at 23:43, Qais Yousef <qyousef@...alina.io> wrote:
>>
>> When uclamp_max is being used, the util of the task could be higher than
>> the spare capacity of the CPU, but due to uclamp_max value we force fit
>> it there.
>>
>> The way the condition for checking for max_spare_cap in
>> find_energy_efficient_cpu() was constructed; it ignored any CPU that has
>> its spare_cap less than or _equal_ to max_spare_cap. Since we initialize
>> max_spare_cap to 0; this lead to never setting max_spare_cap_cpu and
>> hence ending up never performing compute_energy() for this cluster and
>> missing an opportunity for a better energy efficient placement to honour
>> uclamp_max setting.
>>
>> max_spare_cap = 0;
>> cpu_cap = capacity_of(cpu) - task_util(p); // 0 if task_util(p) is high
>>
>> ...
>>
>> util_fits_cpu(...); // will return true if uclamp_max forces it to fit
s/true/1/ ?
>>
>> ...
>>
>> // this logic will fail to update max_spare_cap_cpu if cpu_cap is 0
>> if (cpu_cap > max_spare_cap) {
>> max_spare_cap = cpu_cap;
>> max_spare_cap_cpu = cpu;
>> }
>>
>> prev_spare_cap suffers from a similar problem.
>>
>> Fix the logic by converting the variables into long and treating -1
>> value as 'not populated' instead of 0 which is a viable and correct
>> spare capacity value.
The issue I see here is in case we don't have any spare capacity left,
the energy calculation (in terms CPU utilization) isn't correct anymore.
Due to `effective_cpu_util()` returning `max=arch_scale_cpu_capacity()`
you never know how big the `busy_time` for the PD really is in this moment.
eenv_pd_busy_time()
for_each_cpu(cpu, pd_cpus)
busy_time += effective_cpu_util(..., ENERGY_UTIL, ...)
^^^^^^^^^
with:
sum_util = min(busy_time + task_busy_time, pd_cap)
^^^^^^^^^
freq = (1.25 * max_util / max) * max_freq
energy = (perf_state(freq)->cost / max) * sum_util
energy is not related to CPU utilization anymore (since there is no idle
time/spare capacity) left.
So EAS keeps packing on the cheaper PD/clamped OPP.
E.g. Juno-r0 [446 1024 1024 446 446 446] with 6 8ms/16ms uclamp_max=440
tasks all running on little PD, cpumask=0x39. The big PD is idle but
never beats prev_cpu in terms of energy consumption for the wakee.
[...]
Powered by blists - more mailing lists