Message-ID: <666857b9-729d-7af3-5d9a-9d9e4c0a68e2@arm.com>
Date:   Fri, 6 Oct 2023 09:46:16 +0100
From:   Lukasz Luba <lukasz.luba@....com>
To:     "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
        dietmar.eggemann@....com, rui.zhang@...el.com,
        amit.kucheria@...durent.com, amit.kachhap@...il.com,
        daniel.lezcano@...aro.org, viresh.kumar@...aro.org,
        len.brown@...el.com, pavel@....cz, mhiramat@...nel.org,
        qyousef@...alina.io, wvw@...gle.com
Subject: Re: [PATCH v4 10/18] PM: EM: Add RCU mechanism which safely cleans
 the old data

Hi Rafael,

A change of direction here, regarding your comment below.

On 10/2/23 14:44, Lukasz Luba wrote:
> 
> 
> On 9/29/23 13:59, Rafael J. Wysocki wrote:
>> On Fri, Sep 29, 2023 at 11:36 AM Lukasz Luba <lukasz.luba@....com> wrote:
> 
> [snip]
> 

[snip]

>>>> Apparently, some frameworks are only going to use the default table
>>>> while the runtime-updatable table will be used somewhere else at the
>>>> same time.
>>>>
>>>> I'm not really sure if this is a good idea.
>>>
>>> Runtime table is only for driving the task placement in the EAS.
>>>
>>> The thermal gov IPA won't make better decisions because it already
>>> has the mechanism to accumulate the error that it made.
>>>
>>> The same applies to DTPM, which works in a more 'configurable' way,
>>> rather than as a hard optimization mechanism (like EAS).
>>
>> My understanding of the above is that the other EM users don't really
>> care that much so they can get away with using the default table all
>> the time, but EAS needs more accuracy, so the table used by it needs
>> to be adjusted in certain situations.
> 
> Yes
> 
>>
>> Fair enough, I'm assuming that you've done some research around it.
>> Still, this is rather confusing.
> 
> Yes, I have presented those ~2y ago in Android Gerrit world
> (got feedback from a few vendors) and in a few Linux conferences.
> 
> For now we don't plan to have this feature for the thermal
> governor or something similar.
> 

I have discussed your comment about the 2 tables with one of our partners.
They would like to have this runtime-modified EM in other places
as well: DTPM and the thermal governor. So you had a good gut feeling.

In the past in our IPA (thermal gov ~2016 and kernel v4.14) we
had two callbacks:
- get_static_power() [1]
- get_dynamic_power() [2]

Later, ~2017/2018 (v4.16), the static power mechanism was removed
completely by this commit 84fe2cab48590e4373978e4e.
The way it was designed, implemented and used justified that
decision. We later used EM in the cpu cooling, which also only
had dynamic power information.

The PID mechanism in IPA tries to compensate for that
missing information (about static power changing over time, or chip
binning) and adjusts the 'error'. How well and how fast it does that
in all situations is a different story (out of scope here).
So, IPA should not be worse with the runtime table.
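
As a rough illustration of the compensation described above, here is a
toy sketch, not the kernel's power allocator code. It assumes a
first-order thermal model and a plain PI controller. The function name
and every constant (thermal resistance, time constant, gains,
sustainable power) are made up for the example. The controller only
knows how much power it grants; the extra static power is invisible to
it and shows up only through the temperature error.

```python
def simulate_pi_compensation(leakage_w, steps=2000, dt=0.1):
    """Toy PI loop: return (final temperature, final granted power).

    Every constant below is an illustrative assumption, not a kernel value.
    """
    target_c = 70.0       # control temperature
    ambient_c = 25.0
    thermal_res = 10.0    # degC per watt at steady state
    thermal_tau = 5.0     # plant time constant, seconds
    sustainable_w = 4.0   # feed-forward guess of the sustainable power
    k_p, k_i = 0.5, 0.2   # proportional and integral gains

    temp_c = ambient_c
    integral = 0.0
    granted_w = 0.0
    for _ in range(steps):
        err = target_c - temp_c
        integral += err * dt  # accumulated 'error'
        granted_w = max(0.0, sustainable_w + k_p * err + k_i * integral)
        # The controller cannot see leakage; it is added behind its back.
        actual_w = granted_w + leakage_w
        steady_c = ambient_c + thermal_res * actual_w
        temp_c += (steady_c - temp_c) * dt / thermal_tau
    return temp_c, granted_w


if __name__ == "__main__":
    for leak in (0.0, 1.0):
        temp, granted = simulate_pi_compensation(leak)
        print(f"leakage={leak:.1f}W temp={temp:.2f}C granted={granted:.2f}W")
```

In this toy model both runs should settle at the control temperature,
with the granted power lower by roughly the unmodelled leakage. That is
the sense in which the accumulated error hides a stale power model; how
fast a real governor gets there is a separate question.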

Static power has always been present on chips and probably always will be.
You might remember my slide 13 from OSPM2023 showing two power
usage plots for the same Big CPU at a fixed 1.4GHz (50% of fmax):
- w/ GPU working in the background using 1-1.5W
- w/o GPU in the background

The same workload runs on the Big CPU, but power is ~15% higher
after ~1min.

Static power (leakage) is the issue that this patch tries
to address for EAS. However, it is not only about leakage.
It's about the whole 'profile', which can differ from what
could be built from the default information available during boot.

So we would want to go for one single table in EM, which
is runtime modifiable.

That is something you might be more confident about, and we would
have less diversity (2 tables) in the kernel.

Regards,
Lukasz


[1] https://elixir.bootlin.com/linux/v4.14/source/drivers/thermal/cpu_cooling.c#L336
[2] https://elixir.bootlin.com/linux/v4.14/source/drivers/thermal/cpu_cooling.c#L383
