[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <64f78eec-61fb-447f-bbba-706bd5a54cd7@arm.com>
Date: Thu, 16 Nov 2023 13:14:49 +0000
From: Lukasz Luba <lukasz.luba@....com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: wyes.karny@....com, peterz@...radead.org, linux-pm@...r.kernel.org,
rafael@...nel.org, vschneid@...hat.com, bristot@...hat.com,
bsegall@...gle.com, rostedt@...dmis.org, dietmar.eggemann@....com,
juri.lelli@...hat.com, beata.michalska@....com,
linux-kernel@...r.kernel.org, qyousef@...alina.io,
viresh.kumar@...aro.org, mingo@...hat.com, mgorman@...e.de
Subject: Re: [PATCH v2 1/2] sched/schedutil: rework performance estimation
Hi Vincent,
I know that there is v3, but just to respond to this below.
On 10/31/23 09:48, Vincent Guittot wrote:
> Hi Lukasz,
>
> On Mon, 30 Oct 2023 at 18:45, Lukasz Luba <lukasz.luba@....com> wrote:
>>
>> Hi Vincent,
>>
>> On 10/26/23 18:09, Vincent Guittot wrote:
>>> The current method to take into account uclamp hints when estimating the
>>> target frequency can end into situation where the selected target
>>> frequency is finally higher than uclamp hints whereas there are no real
>>> needs. Such cases mainly happen because we are currently mixing the
>>> traditional scheduler utilization signal with the uclamp performance
>>> hints. By adding these 2 metrics, we loose an important information when
>>> it comes to select the target frequency and we have to make some
>>> assumptions which can't fit all cases.
>>>
>>> Rework the interface between the scheduler and schedutil governor in order
>>> to propagate all information down to the cpufreq governor.
>>>
>>> effective_cpu_util() interface changes and now returns the actual
>>> utilization of the CPU with 2 optional inputs:
>>> - The minimum performance for this CPU; typically the capacity to handle
>>> the deadline task and the interrupt pressure. But also uclamp_min
>>> request when available.
>>> - The maximum targeting performance for this CPU which reflects the
>>> maximum level that we would like to not exceed. By default it will be
>>> the CPU capacity but can be reduced because of some performance hints
>>> set with uclamp. The value can be lower than actual utilization and/or
>>> min performance level.
>>
>> You have probably missed my question in the last v1 patch set.
>
> Yes, sorry
>
>>
>> The description above needs a bit of clarification, since looking at the
>> patches some dark corners are introduced IMO:
>>
>> Currently, we have a less aggressive power saving policy than this
>> proposal.
>>
>> The questions:
>> What if the PD has 4 CPUs, the max util found is 500 and is from a CPU
>> w/ uclamp_max, but there is another CPU with normal utilization 499?
>> What should be the final frequency for that PD?
>
> We now follow the same sequence everywhere which can be summarized by:
>
> for each cpu sharing the same frequency domain:
> util = cpu_util(cpu)
> eff_util = effective_cpu_util(util, &min, &max)
> eff_util = sugov_effective_cpu_perf(eff_util, min, max) which
> applies the dvfs headroom if needed
> max_util = max(max_util, eff_util);
>
> EAS anticipates the impact of the waking task on utilization and max
> but the end result is the same as above once the task is enqueued so I
> didn't show it for simplicity
>
> Coming back to your example
> CPU0 has uclamp_max = 500 and an actual utilization above 500. Its
> eff_util will be 500
> CPU1 doesn't have uclamp_max constraint and an actual utilization of
> 499 which will be increase with dvfs headroom to 623 in
> sugov_effective_cpu_perf()
>
> The final max util will be 623
>
> With the current implementation we apply the dvfs headroom to the
> final max_util (which is the CPU0 with uclamp_max == 500) whereas we
> now apply the dvfs headroom on each CPU inside
> sugov_effective_cpu_perf()
>
> The main difference is that if CPU1 has an actual utilization of 400,
> the max_util of the frequency domain will be 500 whereas it is 625
> after applying dvfs headroom with current implementation
>
>>
>> In current design, where we care more about 'delivered performance
>> to the tasks' than power saving, the +20% would be applied for the
>> frequency. Therefore if that CPU with 499 util doesn't have uclamp_max,
>> it would get a decent amount of idle time for its tasks (to compensate
>> some workload variation).
>
> CPU1 with 499 still gets its 25% margin or I missed something in your example ?
You understood this correctly. I don't have more questions. It should
than work OK.
Thanks,
Lukasz
Powered by blists - more mailing lists