linux-kernel - Re: [PATCH] sched/documentation: elaborate on uclamp limitations


Message-ID: <79127e9c-8583-8356-a9db-b9af74b6a1b0@arm.com>
Date:   Tue, 23 May 2023 16:39:27 +0200
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Vincent Guittot <vincent.guittot@...aro.org>,
        Hongyan Xia <hongyan.xia2@....com>
Cc:     Qais Yousef <qyousef@...alina.io>,
        Jonathan Corbet <corbet@....net>, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>
Subject: Re: [PATCH] sched/documentation: elaborate on uclamp limitations

On 23/05/2023 11:23, Vincent Guittot wrote:
> On Thu, 18 May 2023 at 14:42, Hongyan Xia <hongyan.xia2@....com> wrote:
>>
>> Hi Qais,
>>
>> On 2023-05-18 12:30, Qais Yousef wrote:
>>> Please CC sched maintainers (Ingo + Peter) next time as they should pick this
>>> up ultimately and they won't see it from the list only.
>>
>> Will do. I was using the get_maintainers script and I thought that gave
>> me all the CCs.
>>
>>> On 05/05/23 16:24, Hongyan Xia wrote:

[...]

>>>> diff --git a/Documentation/scheduler/sched-util-clamp.rst b/Documentation/scheduler/sched-util-clamp.rst
>>>> index 74d5b7c6431d..524df07bceba 100644
>>>> --- a/Documentation/scheduler/sched-util-clamp.rst
>>>> +++ b/Documentation/scheduler/sched-util-clamp.rst
>>>> @@ -669,6 +669,19 @@ but not proportional to Fmax/Fmin.
>>>>
>>>>           p0->util_avg = 300 + small_error
>>>>
>>>> +The reason why util_avg is around 300 even though it runs for 900 at Fmin is:
> 
> What does it mean running for 900 at Fmin ? util_avg is a ratio in the
> range [0:1024] without time unit
> 
>>>> +Although running at Fmin reduces the rate of rq_clock_pelt() to 1/3 thus
>>>> +accumulates util_sum at 1/3 of the rate at Fmax, the clock period
>>>> +(rq_clock_pelt() now minus previous rq_clock_pelt()) in:
>>>> +
>>>> +::
>>>> +
>>>> +        util_sum / clock period = util_avg
> 
> I don't get the meaning of the formula above ? There is no "clock
> period" (although I'm not sure what it means here) involved when
> computing util_avg

I also didn't get this one. IMHO, the relation between util_avg and
util_sum is given by `divider = LOAD_AVG_MAX - 1024 + avg->period_contrib`.
But I can't see how this matters here.
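
For reference, the divider relation can be sketched as below. This is a minimal illustration, not the kernel code: the function names are hypothetical, and both the value of LOAD_AVG_MAX (47742) and the scaling of util_sum by 1024 are assumptions taken from my reading of the PELT sources, so please check them against the tree you are using.

```python
# Hypothetical helper names; only LOAD_AVG_MAX and period_contrib come from the kernel.
LOAD_AVG_MAX = 47742  # assumed value of the kernel constant


def pelt_divider(period_contrib):
    """divider = LOAD_AVG_MAX - 1024 + period_contrib, period_contrib in [0, 1023]."""
    return LOAD_AVG_MAX - 1024 + period_contrib


def util_avg_from_sum(util_sum, period_contrib):
    """Assumes util_sum is already scaled by 1024 (SCHED_CAPACITY_SCALE)."""
    return util_sum // pelt_divider(period_contrib)
```

No wall-clock "clock period" appears here: the divider depends only on a constant and on how far we are into the current PELT period.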

The crucial point here is IMHO that as long as we have idle time
(p->util_avg < CPU (current) capacity), util_avg will not rise to 1024,
since at wakeup util_avg will only be decayed (the task was sleeping, i.e.
!!se->on_rq = 0). And we are scale invariant thanks to the functionality
in update_rq_clock_pelt() (which is executed when p is running).

The pelt clock update at this moment (wakeup) sets clock_pelt to
clock_task since rq->curr is the idle task, but IMHO that is not the
reason why p->util_avg behaves like this.

The moment `p->util_avg >= CPU (current) capacity`, there is no idle time
left, i.e. no such `only decay` updates happen for p anymore (only
`accrue/decay` updates in tick), and the result is that p->util_avg goes
to 1024.
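
The two regimes described above can be reproduced with a toy model. This is only a sketch under stated assumptions: it uses floating point instead of the kernel's fixed-point arithmetic, a decay factor y with y^32 = 0.5 per millisecond of PELT time, and a hypothetical task with a 10 ms period. The function and parameter names are made up for illustration.

```python
# Simplified floating-point model of PELT; not the kernel's fixed-point code.
Y = 0.5 ** (1.0 / 32.0)  # decay per 1 ms of PELT time, chosen so that Y**32 == 0.5


def simulate(work_ms_at_fmax, period_ms, freq_ratio, n_periods=300):
    """Return (util_at_wakeup, util_at_end_of_run) after n_periods.

    freq_ratio is current frequency / Fmax (hypothetical parameter name).
    """
    util = 0.0
    peak = 0.0
    for _ in range(n_periods):
        # Wall-clock running time grows as the frequency drops, capped by the period.
        wall_run = min(work_ms_at_fmax / freq_ratio, period_ms)
        # While p runs, clock_pelt advances scaled by the frequency ratio.
        pelt_run = wall_run * freq_ratio
        d = Y ** pelt_run
        util = util * d + 1024.0 * (1.0 - d)  # accrue and decay
        peak = util
        if wall_run < period_ms:
            # Idle time exists: clock_pelt catches up with clock_task, so the
            # rest of the period is seen as sleep and util is only decayed.
            util *= Y ** (period_ms - pelt_run)
    return util, peak


if __name__ == "__main__":
    print(simulate(3.0, 10.0, 1.0 / 3.0))  # idle time left: stays around 300
    print(simulate(4.0, 10.0, 1.0 / 3.0))  # no idle time: approaches 1024
```

With 3 ms of work per 10 ms period at Fmin = Fmax/3, the task runs for 9 ms of wall time but still has idle time, and the modelled util_avg oscillates around 300. With 4 ms of work the task can never finish within its period, the decay-only step never happens, and the value saturates near 1024.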

> Also, there is no linear relation between util_avg and Fmin/Fmax
> ratio. Fmin/Fmax ratio is meaningful in regards to the ratio between
> running time and period time of a periodic task. I understand the
> reference of pelt in this document as a quite simplified description
> of PELT so I'm not sure that adding a partial explanation will help.
> It will probably cause more confusion to people. The only thing that
> is sure, is that PELT expects some idle time to stay fully invariant
> for periodic task

+1 ... we have to be able to understand the code. BTW, schedutil.rst also
has paragraphs about PELT and `Frequency / CPU Invariance` and refers to
kernel/sched/pelt.h:update_rq_clock_pelt() for details.

[...]