Message-ID: <CAGXk5yqGOC8STC4wUNbUyKWR1m1NPOu6K-qXTz0=_UJEe64B2g@mail.gmail.com>
Date:   Wed, 11 Oct 2023 09:02:22 -0700
From:   Wei Wang <wvw@...gle.com>
To:     Lukasz Luba <lukasz.luba@....com>
Cc:     "Rafael J. Wysocki" <rafael@...nel.org>,
        linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
        dietmar.eggemann@....com, rui.zhang@...el.com,
        amit.kucheria@...durent.com, amit.kachhap@...il.com,
        daniel.lezcano@...aro.org, viresh.kumar@...aro.org,
        len.brown@...el.com, pavel@....cz, mhiramat@...nel.org,
        qyousef@...alina.io
Subject: Re: [PATCH v4 10/18] PM: EM: Add RCU mechanism which safely cleans
 the old data

On Fri, Oct 6, 2023 at 1:45 AM Lukasz Luba <lukasz.luba@....com> wrote:
>
> Hi Rafael,
>
> A change of direction here, regarding your comment below.
>
> On 10/2/23 14:44, Lukasz Luba wrote:
> >
> >
> > On 9/29/23 13:59, Rafael J. Wysocki wrote:
> >> On Fri, Sep 29, 2023 at 11:36 AM Lukasz Luba <lukasz.luba@....com> wrote:
> >
> > [snip]
> >
>
> [snip]
>
> >>>> Apparently, some frameworks are only going to use the default table
> >>>> while the runtime-updatable table will be used somewhere else at the
> >>>> same time.
> >>>>
> >>>> I'm not really sure if this is a good idea.
> >>>
> >>> Runtime table is only for driving the task placement in the EAS.
> >>>
> >>> The thermal gov IPA won't make better decisions because it already
> >>> has the mechanism to accumulate the error that it made.
> >>>
> >>> The same applies to DTPM, which works in a more 'configurable' way,
> >>> rather than a hard optimization mechanism (like EAS).
> >>
> >> My understanding of the above is that the other EM users don't really
> >> care that much so they can get away with using the default table all
> >> the time, but EAS needs more accuracy, so the table used by it needs
> >> to be adjusted in certain situations.
> >
> > Yes
> >
> >>
> >> Fair enough, I'm assuming that you've done some research around it.
> >> Still, this is rather confusing.
> >
> > Yes, I have presented those ~2y ago in Android Gerrit world
> > (got feedback from a few vendors) and in a few Linux conferences.
> >
> > For now we don't plan to have this feature for the thermal
> > governor or something similar.
> >
>
> I have discussed your comment about 2 tables with one of our partners.
> They would like to have this runtime modified EM in other places
> as well: DTPM and thermal governor. So you had a good gut feeling.
>
> In the past in our IPA (thermal gov ~2016 and kernel v4.14) we
> had two callbacks:
> - get_static_power() [1]
> - get_dynamic_power() [2]
>
> Later, around 2017/2018 (v4.16), the static power mechanism was removed
> completely by commit 84fe2cab48590e4373978e4e.
> The way it was designed, implemented and used justified that
> decision. We later used EM in the cpu cooling which also only
> had dynamic power information.
>
> The PID mechanism in IPA tries to compensate for that
> missing information (about static power changing over time or chip
> binning) and adjusts the 'error'. How good and fast that is in all
> situations is a different story (out of scope here).
> So, IPA should not be worse with the runtime table.
>
> Static power has always been present on the chips and probably always
> will be. You might remember my slide 13 from OSPM2023 showing two power
> usage plots for the same Big CPU and 1.4GHz fixed (50% of fmax):
> - w/ GPU working in the background using 1-1.5W
> - w/o GPU in the background
>
> The same workload runs on Big, but power is ~15% higher
> after ~1min.
>
> The static power (leakage) is the issue that this patch tries
> to address for EAS. However, it is not only about leakage.
> It's about the whole 'profile', which can differ from what
> could be built from the default information available at boot.
>
> So we would want to go for one single table in EM, which
> is runtime modifiable.
>
> That is something you might be more confident in, and we would
> have less diversity (2 tables) in the kernel.
>
> Regards,
> Lukasz
>
>

Indeed, we had a conversation about this with Lukasz recently. The key
idea is that there is no compelling reason to introduce diversity in
the mathematics involved. If we have confidence in the superior
accuracy of our model, it should be universally implemented. While the
governors are designed with some error tolerance, they can benefit
from enhanced accuracy in their operation.
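
For illustration only, here is a minimal user-space Python sketch of the
single-table idea. It is not kernel code and not the patch's
implementation: the class, the function names and every number in the
table are hypothetical, and only the ~15% figure comes from the
measurement quoted above. It assumes each writer publishes a complete new
table with one reference swap, loosely mimicking an RCU-style update, so
that every consumer (EAS, IPA, DTPM) reads the same data.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class PerfState:
    freq_khz: int
    power_mw: int  # total power at this frequency (hypothetical values)


class EnergyModel:
    """One table per performance domain, shared by every consumer."""

    def __init__(self, default_table: Iterable[PerfState]):
        self._table: Tuple[PerfState, ...] = tuple(default_table)

    def snapshot(self) -> Tuple[PerfState, ...]:
        # Readers grab a reference to the current immutable table.
        return self._table

    def update(self, new_table: Iterable[PerfState]) -> None:
        # The writer builds a full new table, then publishes it with a
        # single reference swap; old snapshots stay valid for readers.
        self._table = tuple(new_table)


def with_leakage(table: Iterable[PerfState], percent: int) -> Tuple[PerfState, ...]:
    """Return a copy of the table with power scaled up by `percent`."""
    return tuple(
        PerfState(ps.freq_khz, ps.power_mw * (100 + percent) // 100)
        for ps in table
    )


def power_at(em: EnergyModel, freq_khz: int) -> int:
    """Power of the lowest performance state that satisfies freq_khz."""
    table = em.snapshot()  # take one consistent snapshot per query
    for ps in table:
        if ps.freq_khz >= freq_khz:
            return ps.power_mw
    return table[-1].power_mw  # clamp to the highest state


# Hypothetical boot-time table for a Big CPU.
DEFAULT_TABLE = (
    PerfState(700_000, 200),
    PerfState(1_400_000, 600),
    PerfState(2_800_000, 2000),
)

if __name__ == "__main__":
    em = EnergyModel(DEFAULT_TABLE)
    print("boot-time power @1.4GHz:", power_at(em, 1_400_000), "mW")
    em.update(with_leakage(em.snapshot(), 15))
    print("runtime power @1.4GHz:", power_at(em, 1_400_000), "mW")
```

With a single table, a thermal-style consumer and an EAS-style consumer
calling power_at() after the update both see the adjusted numbers, which
is the "same mathematics everywhere" point made above.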

Thanks!
-Wei

> [1]
> https://elixir.bootlin.com/linux/v4.14/source/drivers/thermal/cpu_cooling.c#L336
> [2]
> https://elixir.bootlin.com/linux/v4.14/source/drivers/thermal/cpu_cooling.c#L383
