linux-kernel - Re: [RFC PATCH 0/6] sched: uclamp sum aggregation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <7f0bfaa5-22c3-4ed3-bf34-cb7ad6b4c335@arm.com>
Date:   Tue, 5 Dec 2023 17:23:54 +0000
From:   Hongyan Xia <hongyan.xia2@....com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Qais Yousef <qyousef@...alina.io>, Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Lukasz Luba <lukasz.luba@....com>,
        Christian Loehle <christian.loehle@....com>,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 0/6] sched: uclamp sum aggregation

On 05/12/2023 16:26, Vincent Guittot wrote:
> On Tue, 5 Dec 2023 at 16:19, Hongyan Xia <hongyan.xia2@....com> wrote:
>>
>> On 04/12/2023 16:12, Vincent Guittot wrote:
>>> On Mon, 4 Dec 2023 at 02:48, Hongyan Xia <hongyan.xia2@....com> wrote:
>>>>
>>>> [...]
>>>>
>>>> Other shortcomings are not that critical, but the fact that uclamp_min's
>>>> effectiveness is divided by N under max aggregation I think is not
>>>> acceptable.
>>>
>>> Change EAS task placement policy in this case to take into account
>>> actual utilization and uclamp_min/max
>>
>> Thank you. I agree. I want to emphasize this specifically because this
>> is exactly what I'm trying to do. The whole series can be rephrased in a
>> different way:
>>
>> - The PELT signal is distorted when uclamp is active.
> 
> Sorry but no it's not >> - Let's consider the [PELT, uclamp_min, uclamp_max] tuple.
> 
> That's what we are already doing with effective_cpu_util. We might
> want to improve how we use them in EAS but that's another story than
> your proposal

It's different. We never catch how we *got* the PELT value. If we wake 
up a task, what we do now is to have the following:

[p->util_avg, p->uclamp_min, p->uclamp_max, target_rq->uclamp_min, 
target_rq->uclamp_max]

But to best understand how big this task really was, we want:

[p->util_avg, previous_rq->uclamp_min_back_then, 
previous_rq->uclamp_max_back_then]

Without such information, issues cannot be avoided because we have no 
idea how big the task really was. Frequency spikes is just one of the 
symptoms when we mis-interpret how big the task was.

>> - Always carrying all three variables is too much, but [PELT,
>> clamped(PELT)] is an approximation that works really well.
> 
> As said before. It's a no go for this mix

I see your concern. To rephrase this series again, I'm simply arguing that

[p->util_avg, previous_uclamp_min, previous_uclamp_max]

is better than

[p->util_avg, p->uclamp_min, p->uclamp_max, target_rq->uclamp_min, 
target_rq->uclamp_max] plus the code to mitigate the issues

in estimating how big the task is.

I anticipate this series to be significantly smaller than the current 
max aggregation approach plus future code to mitigate the problems, but 
I'll keep trying to improve it to hopefully address your concerns.

>>
>> Of course, I'll explore if there's a way to make things less messy. I
>> just realized why I didn't do things util_est way but instead directly
>> clamping on PELT, it's because util_est boosts util_avg and can't work
>> for uclamp_max. I'll keep exploring options.
>>
>>>> [...]