[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bae6421c-5192-44e6-9ed9-1d32e7122cb7@arm.com>
Date: Tue, 19 Mar 2024 17:05:26 +0000
From: Hongyan Xia <hongyan.xia2@....com>
To: Dietmar Eggemann <dietmar.eggemann@....com>,
Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Juri Lelli <juri.lelli@...hat.com>, Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Valentin Schneider <vschneid@...hat.com>
Cc: Qais Yousef <qyousef@...alina.io>,
Morten Rasmussen <morten.rasmussen@....com>,
Lukasz Luba <lukasz.luba@....com>,
Christian Loehle <christian.loehle@....com>, linux-kernel@...r.kernel.org,
David Dai <davidai@...gle.com>, Saravana Kannan <saravanak@...gle.com>
Subject: Re: [RFC PATCH v2 1/7] Revert "sched/uclamp: Set max_spare_cap_cpu
even if max_spare_cap is 0"
On 19/03/2024 15:34, Dietmar Eggemann wrote:
> On 01/02/2024 14:11, Hongyan Xia wrote:
>> From: Hongyan Xia <Hongyan.Xia2@....com>
>>
>> That commit creates further problems because 0 spare capacity can be
>> either a real indication that the CPU is maxed out, or the CPU is
>> UCLAMP_MAX throttled, but we end up giving all of them a chance which
>> can results in bogus energy calculations. It also tends to schedule
>> tasks on the same CPU and requires load balancing patches. Sum
>> aggregation solves these problems and this patch is not needed.
>>
>> This reverts commit 6b00a40147653c8ea748e8f4396510f252763364.
>
> I assume you did this revert especially for the 'Scenario 5: 8 tasks
> with UCLAMP_MAX of 120' testcase?
More or less. Actually you can already see the problem in Scenario 1.
Ideally the 4 uclamp_max tasks should be evenly distributed on 4 little
CPUs, but from time to time task placement places more than 1 such task
on the same CPU, leaving some other little CPUs not occupied.
> IMHO, the issue is especially visible in compute_energy()'s busy_time
> computation with a valid destination CPU (dst_cpu >= 0). I.e. when we
> have to add performance domain (pd) and task busy time.
>
> find_energy_efficient_cpu() (feec())
>
> for each pd
> for each cpu in pd
>
> set {prev_,max}_spare_cap
>
> bail if prev_ and max_spare_cap < 0 (was == 0 before )
>
> {base_,prev_,cur_}energy = compute_energy
>
> So with the patch we potentially compute energy for a saturated PD
> according:
>
> compute_energy()
>
> if (dst_cpu >= 0)
> busy_time = min(eenv->pd_cap, eenv->busy_time + eenv->task_busy_time)
> <----(a)---> <--------------(b)------------------->
>
> energy = em_cpu_energy(pd->em_pd, max_util, busy_time, eenv->cpu_cap)
>
> If (b) > (a) then we're saturated and 'energy' is bogus.
Yeah, I think what's happening is because placing more tasks on the same
CPU won't increase energy computation, so in the end task placement
thinks it's the better decision. The root issue is that once you have
uclamp_max, you can theoretically fit an infinite number of such tasks
on the same CPU.
> The way to fix this is up for discussion:
>
> (1) feec() returning prev_cpu
> (2) feec() returning -1 (forcing wakeup into sis() -> sic())
> (3) using uclamped values for task and rq utilization
>
> None of those have immediately given the desired task placement on
> mainline (2 tasks on each of the 4 little CPUs and no task on the 2 big
> CPUs on my [l B B l l l] w/ CPU capacities = [446 1024 1024 446 446 446]
> machine) you can achieve with uclamp sum aggregation.
Personally from the results I've seen I definitely prefer (3), although
(3) has other problems. One thing is that sum aggregation pushes up
utilization with uclamp_min, but its energy consumption definitely won't
be that high. The real energy is between its util_avg and util_avg_uclamp.
I haven't seen this as a real problem, but maybe we can see even better
task placement if this is accounted for.
> [...]
Powered by blists - more mailing lists