Message-ID: <39cde23a-19d8-4e64-a1d2-f26bce264883@arm.com>
Date:   Fri, 24 Nov 2023 10:34:59 +0000
From:   Hongyan Xia <hongyan.xia2@....com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     lukasz.luba@....com, juri.lelli@...hat.com, mingo@...hat.com,
        dietmar.eggemann@....com, peterz@...radead.org, bsegall@...gle.com,
        rostedt@...dmis.org, bristot@...hat.com, mgorman@...e.de,
        vschneid@...hat.com, rafael@...nel.org, qyousef@...alina.io,
        viresh.kumar@...aro.org, linux-kernel@...r.kernel.org,
        linux-pm@...r.kernel.org
Subject: Re: [PATCH] sched/pelt: avoid underestimate of task utilization

On 22/11/2023 17:37, Vincent Guittot wrote:
> The same but with plain text instead of html ...
> 
>   On Wed, 22 Nov 2023 at 17:40, Hongyan Xia <hongyan.xia2@....com> wrote:
>>
>> Hi Vincent,
>>
>> On 22/11/2023 14:01, Vincent Guittot wrote:
>>> It has been reported that a thread's util_est can significantly decrease as
>>> a result of sharing the CPU with other threads. The use case can be easily
>>> reproduced with a periodic task TA that runs for 1ms and sleeps for 100us.
>>> When the task is alone on the CPU, its max utilization and its util_est are
>>> around 888. If another similar task starts to run on the same CPU, TA will
>>> have to share the CPU runtime and its maximum utilization will decrease to
>>> around half the CPU capacity (512), then TA's util_est will follow this new
>>> maximum trend, which is only the result of sharing the CPU with other
>>> tasks. Such a situation can be detected with runnable_avg, which is close
>>> or equal to util_avg when TA is alone but rises above util_avg when TA
>>> shares the CPU with other threads and waits on the runqueue.
>>
>> Thanks for bringing this case up. I'm a bit nervous about skipping util_est
>> updates this way. While it is true that this avoids dropping util_est
>> when the task is still busy doing stuff, it also avoids dropping
>> util_est when the task really is becoming less busy. If a task has a
>> legitimate reason to drop its utilization, it seems odd to me that the
>> drop in its util_est can be blocked by a new task joining this rq and
>> pushing up runnable_avg.
> 
>   We prefer a util_est that overestimates rather than underestimates,
> because in the first case you will not provide enough performance to the
> task, which will remain under-provisioned, whereas in the other case you
> will create some idle time, which will reduce contention and,
> as a result, reduce util_est. So the overestimate will be transient
> whereas the underestimate would remain.

My concern is mostly about energy efficiency, although I have no 
concrete evidence of the energy impact, so I'm not firmly against this patch.

> 
>> Also, something about rt-app. Is there an easy way to ask an rt-app
>> thread to achieve a certain amount of throughput (like loops per
>> second)? I think 'runs 1ms and sleeps 100us' may not entirely simulate a
>> task that really wants to preserve a util_est of 888. If its utilization
> 
> 
>   We can do this in rt-app with timer...

Thanks. Looking at the rt-app doc, I think a timer with absolute 
timestamps does what I want.
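Something like the following rt-app fragment, I believe, would keep the
demanded work fixed per period (run 1ms every 1.1ms, matching the
reproducer above) regardless of contention, since the absolute timer
fires at fixed wall-clock instants. Key names follow rt-app's JSON
grammar as I understand it; the exact values are illustrative:

```json
{
	"global": {
		"duration": 10,
		"default_policy": "SCHED_OTHER"
	},
	"tasks": {
		"thread0": {
			"instance": 1,
			"run": 1000,
			"timer": { "ref": "tick", "period": 1100, "mode": "absolute" }
		}
	}
}
```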
