Date:   Mon, 19 Oct 2020 12:10:54 +0100
From:   Lukasz Luba <lukasz.luba@....com>
To:     Viresh Kumar <viresh.kumar@...aro.org>
Cc:     Vincent Guittot <vincent.guittot@...aro.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Zhang Rui <rui.zhang@...el.com>,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        Amit Daniel Kachhap <amit.kachhap@...il.com>,
        Javi Merino <javi.merino@...nel.org>,
        Amit Kucheria <amit.kucheria@...durent.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Quentin Perret <qperret@...gle.com>,
        Rafael Wysocki <rjw@...ysocki.net>,
        "open list:THERMAL" <linux-pm@...r.kernel.org>
Subject: Re: [PATCH 2/2] thermal: cpufreq_cooling: Reuse effective_cpu_util()



On 10/19/20 8:40 AM, Viresh Kumar wrote:
> On 30-07-20, 12:16, Lukasz Luba wrote:
>> Hi Viresh,
>>
>> On 7/30/20 7:24 AM, Viresh Kumar wrote:
>>> On 17-07-20, 11:46, Vincent Guittot wrote:
>>>> On Thu, 16 Jul 2020 at 16:24, Lukasz Luba <lukasz.luba@....com> wrote:
>>>>> On 7/16/20 12:56 PM, Peter Zijlstra wrote:
>>>>>> Currently cpufreq_cooling appears to estimate the CPU energy usage by
>>>>>> calculating the percentage of idle time using the per-cpu cpustat stuff,
>>>>>> which is pretty horrific.
>>>>>
>>>>> Even worse, it then *samples* the *current* CPU frequency at that
>>>>> particular point in time and assumes that when the CPU wasn't idle
>>>>> during that period - it had *this* frequency...
>>>>
>>>> So there is 2 problems in the power calculation of cpufreq cooling device :
>>>> - How to get an accurate utilization level of the cpu which is what
>>>> this patch is trying to fix because using idle time is just wrong
>>>> whereas scheduler utilization is frequency invariant
>>>
>>> Since this patch is targeted only towards fixing this particular
>>> problem, should I change something in the patch to make it acceptable
>>> ?
>>>
>>>> - How to get power estimate from this utilization level. And as you
>>>> pointed out, using the current freq which is not accurate.
>>>
>>> This should be tackled separately I believe.
>>>
>>
>> I don't think that these two are separate. Furthermore, I think we
>> would need this kind of information also in future in the powercap.
>> I've discussed with Daniel this possible scenario.
>>
>> We have a vendor who presented issue with the IPA input power and
>> pointed out these issues. Unfortunately, I don't have this vendor
>> phone but I assume it can last a few minutes without changing the
>> max allowed OPP. Based on their plots the frequency driven by the
>> governor is changing, also the idles are present during the IPA period.
>>
>> Please give me a few days, because I am also plumbing these stuff
>> and would like to present it. These two interfaces: involving cpufreq
>> driver or fallback mode for utilization and EM.
> 
> Its been almost 3 months, do we have any update for this? We really
> would like to get this patchset merged in some form as it provides a
> simple update and I think more work can be done by anyone over it in
> future.
> 

I made a few implementations to compare the results with reality (power
measured using a power meter on the cluster rails). The idea of using the
utilization from schedutil_cpu_util() has some edge cases with errors. The
signal is good for comparison and short-term prediction, but taking it as
an approximation for an arbitrary past period (e.g. 100ms) has issues. It
is good when estimating energy cost, e.g. in compute_energy().

What your renamed version of the old schedutil_cpu_util() does is return
the sum of the utilization of the runqueues (CFS, RT, DL, (IRQ)) at that
time. This utilization depends on the sum of the utilization of the tasks
that are there. These tasks could have shuffled in the past (especially
when we deal with a period of ~100ms in IPA)...
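
To illustrate the concern, here is a minimal user-space sketch in C. It is
not kernel code: struct rq_util, CPU_CAPACITY, instantaneous_util() and
period_avg_util() are hypothetical names made up for this example, and the
plain clamped sum is a simplification of what the scheduler helper really
does (IRQ handling in particular is more involved). It only shows that a
value read at the end of a period can differ a lot from the average over
that period when tasks have moved.

```c
#include <stddef.h>

/* Assumed full-scale CPU capacity for this sketch. */
#define CPU_CAPACITY 1024UL

/* Hypothetical snapshot of per-class utilization on one CPU. */
struct rq_util {
	unsigned long cfs;
	unsigned long rt;
	unsigned long dl;
	unsigned long irq;
};

/* Utilization seen at one point in time: sum of the classes, clamped. */
static unsigned long instantaneous_util(const struct rq_util *u)
{
	unsigned long sum = u->cfs + u->rt + u->dl + u->irq;

	return sum > CPU_CAPACITY ? CPU_CAPACITY : sum;
}

/* Mean of the instantaneous values over n snapshots of a past period. */
static unsigned long period_avg_util(const struct rq_util *samples, size_t n)
{
	unsigned long total = 0;
	size_t i;

	if (n == 0)
		return 0;

	for (i = 0; i < n; i++)
		total += instantaneous_util(&samples[i]);

	return total / n;
}
```

For example, with four snapshots where CFS utilization is 800, 800, 800 and
then 100 (a heavy task migrated away just before the end of the period),
reading only the last snapshot gives 100, while the average over the period
is 625.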

I am currently working on a few different topics, not full time on this
one. Thus, I tend to agree that this provides 'simple update and ...
more work can be done' in future. However, I am a bit concerned that it
would require some exports from the scheduler and some changes to
schedutil, and I am not sure they would pay off.

If Rafael and Peter allow you to change these subsystems, then I
don't mind.

What I am trying to implement is different from this idea.

Regards,
Lukasz